Replica amplification of nucleic acid arrays

ABSTRACT

Disclosed are improved methods of making and using immobilized arrays of nucleic acids, particularly methods for producing replicas of such arrays. Included are methods for producing high density arrays of nucleic acids and replicas of such arrays, as well as methods for preserving the resolution of arrays through rounds of replication. Also included are methods which take advantage of the availability of replicas of arrays for increased sensitivity in detection of sequences on arrays. Improved methods of sequencing nucleic acids immobilized on arrays utilizing single copies of arrays and methods taking further advantage of the availability of replicas of arrays are disclosed. The improvements lead to higher fidelity and longer read lengths of sequences immobilized on arrays. Methods are also disclosed which improve the efficiency of multiplex PCR using arrays of immobilized nucleic acids.

[0001] This application is a continuation-in-part of U.S. patent application Ser. No. 09/143,014, filed Aug. 28, 1998. The application claims the benefit of U.S. Provisional Application No. 60/061,511, filed Oct. 10, 1997 and U.S. Provisional Application No. 60/076,570, filed Mar. 2, 1998.

FIELD OF THE INVENTION

[0002] The invention relates in general to the reproducible mass-production of nucleic acid arrays. The invention also relates to methods of sequencing nucleic acids on arrays.

BACKGROUND OF THE INVENTION

[0003] Arrays of nucleic acid molecules are of enormous utility in facilitating methods aimed at genomic characterization (such as polymorphism analysis and high-throughput sequencing techniques), screening of clinical patients or entire pedigrees for the risk of genetic disease, elucidation of protein/DNA- or protein/protein interactions or the assay of candidate pharmaceutical compounds for efficacy; however, such arrays are both labor-intensive and costly to produce by conventional methods. Highly ordered arrays of nucleic acid fragments are known in the art (Fodor et al., U.S. Pat. No. 5,510,270; Lockhart et al., U.S. Pat. No. 5,556,752). Chetverin and Kramer (WO 93/17126) are said to disclose a highly ordered array which may be amplified.

[0004] U.S. Pat. No. 5,616,478 of Chetverin and Chetverina reportedly claims methods of nucleic acid amplification, in which pools of nucleic acid molecules are positioned on a support matrix to which they are not covalently linked. Utermohlen (U.S. Pat. No. 5,437,976) is said to disclose nucleic acid molecules randomly immobilized on a reusable matrix.

[0005] There is a need in the art for improved methods of nucleic acid array design and production. There is also a need in the art for methods with improved resolution and/or sensitivity for detection of sequences on nucleic acid arrays. There is also a need in the art for improved methods of sequencing the molecules on nucleic acid arrays.

SUMMARY OF THE INVENTION

[0006] The invention provides a method of producing a high density array of immobilized nucleic acid molecules, such method comprising the steps of: 1) creating an array of spots of a nucleic acid capture activity such that the spots of said capture activity are separated by a distance greater than the diameter of the spots, and the size of the spots is less than the diameter of the excluded volume of the nucleic acid molecule to be captured; and 2) contacting the array of spots of nucleic acid capture activity with an excess of nucleic acid molecules with an excluded volume diameter greater than the diameter of the spots of nucleic acid capture activity, resulting in an immobilized array of nucleic acid molecules in which each spot of nucleic acid capture activity can bind only one nucleic acid molecule with an excluded volume diameter greater than the size of said spots of nucleic acid capture activity.
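The two geometric conditions of step 1) can be illustrated with a minimal sketch. The function name, the parameter names and the use of nanometers are assumptions made for illustration only; the sketch merely restates the stated inequalities and does not model binding chemistry.

```python
def spot_captures_single_molecule(spot_diameter_nm, spot_spacing_nm,
                                  excluded_volume_diameter_nm):
    """Check the two geometric conditions recited for the capture array.

    All names and units here are hypothetical; only the inequalities
    come from the text.
    """
    # Spots must be separated by a distance greater than their diameter.
    spots_separated = spot_spacing_nm > spot_diameter_nm
    # Each spot must be smaller than the excluded volume diameter of the
    # molecule, so a bound molecule excludes any second molecule.
    spot_smaller_than_molecule = spot_diameter_nm < excluded_volume_diameter_nm
    return spots_separated and spot_smaller_than_molecule


# Illustrative values only: a 50 nm spot on a 200 nm spacing, with a
# molecule whose excluded volume diameter is 300 nm.
print(spot_captures_single_molecule(50, 200, 300))
```

When either inequality fails, for example when the spot is larger than the excluded volume diameter of the molecule, the function returns False, reflecting that more than one molecule could then occupy a spot.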

[0007] In a preferred embodiment of the invention, the nucleic acid capture activity may be a hydrophobic compound, an oligonucleotide, an antibody or fragment of an antibody, a protein, a peptide, an intercalator, biotin, avidin, or streptavidin.

[0008] In another embodiment of the invention, the immobilized array of spots of a nucleic acid capture activity is arranged in a predetermined geometry.

[0009] In another embodiment, the immobilized spots of a nucleic acid capture activity are aligned with other microfabricated features.

[0010] The invention also encompasses a method of making a plurality of a high-density nucleic acid array made using spots of nucleic acid capture activity as described above.

[0011] The invention provides a method for the detection of a nucleic acid on an array of nucleic acid molecules, such method comprising the steps of generating a plurality of a nucleic acid molecule array wherein the nucleic acid molecules of each member of said plurality occupy positions which correspond to those positions occupied by the nucleic acid molecules of each other member of said plurality of a nucleic acid array, and subjecting one or more members of said plurality, but at least one less than the total number of said plurality, to a method of signal detection comprising a signal amplification method which renders said member of said plurality of a nucleic acid array non-reusable.

[0012] It is preferred that the signal amplification method comprises fluorescence measurement.

[0013] In a preferred embodiment the method of detection of a nucleic acid on an array of nucleic acid molecules detects the amount of an RNA expressed in a first RNA-containing nucleic acid population relative to that expressed in a second RNA-containing nucleic acid population. The method further comprises the steps of preparing a first population of fluorescently labeled cDNA using said first population of RNA-containing nucleic acid as a template, preparing a second fluorescently labeled cDNA population using said second population of RNA-containing nucleic acid as a template, said second fluorescently labeled cDNA population being labeled with a fluorescent label distinguishable from that used to label said first population, contacting a mixture of said first fluorescently labeled cDNA population and said second fluorescently labeled cDNA population with a member of said plurality of nucleic acid arrays under conditions which permit hybridization of said fluorescently labeled cDNA populations with nucleic acids immobilized on said members of said plurality of nucleic acid arrays and detecting the fluorescence of said first fluorescently labeled population of cDNA and the fluorescence of said second fluorescently labeled population of cDNA hybridized to said member of said plurality of nucleic acid arrays, wherein the relative amount of said first fluorescent label and said second fluorescent label detected on a given nucleic acid feature of said array indicates the relative level of expression of RNA derived from the nucleic acid of that feature in the mRNA-containing cDNA populations tested.
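The comparison of the two fluorescent labels on each feature may be sketched as follows. Expressing the relative amount as a base-2 logarithm of the intensity ratio, and applying a lower intensity floor, are assumptions introduced here for illustration; the text requires only that the relative amount of the two labels be determined. The function and parameter names are hypothetical.

```python
import math


def log2_expression_ratios(first_label, second_label, floor=1.0):
    """Return the log2 ratio of first to second label for each feature.

    first_label, second_label: dicts mapping a feature identifier to the
    fluorescence intensity measured for that label on that feature.
    floor: hypothetical lower bound that avoids division by zero.
    """
    ratios = {}
    for feature, first_intensity in first_label.items():
        second_intensity = second_label[feature]
        ratio = max(first_intensity, floor) / max(second_intensity, floor)
        ratios[feature] = math.log2(ratio)
    return ratios


# Illustrative intensities only; not measured data.
first = {"feature_1": 800.0, "feature_2": 100.0}
second = {"feature_1": 200.0, "feature_2": 100.0}
print(log2_expression_ratios(first, second))
```

A positive value indicates more of the first label than the second on that feature, a negative value the reverse, and zero indicates equal amounts.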

[0014] In another embodiment the method of detection of a nucleic acid on an array of nucleic acid molecules detects the amount of an RNA expressed in a first RNA-containing nucleic acid population relative to that expressed in a second RNA-containing nucleic acid population. The method further comprises the steps of preparing a first population of fluorescently labeled cDNA using said first population of RNA-containing nucleic acid as a template, preparing a second fluorescently labeled cDNA population using said second population of RNA-containing nucleic acid as a template, contacting said first fluorescently labeled cDNA population with one member of a plurality of immobilized nucleic acid arrays under conditions which permit hybridization of said fluorescently labeled cDNA population with nucleic acid immobilized on said member of a plurality of immobilized nucleic acid arrays, contacting said second fluorescently labeled cDNA population with another member of the same plurality of immobilized nucleic acid arrays under conditions which permit hybridization of said fluorescently labeled cDNA populations with nucleic acid immobilized on said members of a plurality of immobilized nucleic acid arrays, detecting the intensity of fluorescence on each member of said plurality contacted with a fluorescently labeled cDNA population, and comparing the intensity of fluorescence detected on each member of said plurality of immobilized nucleic acid arrays so tested, to determine the relative expression of mRNA derived from those nucleic acids on the array in the mRNA-containing cDNA populations tested.

[0015] The invention provides a method of preserving the resolution of nucleic acid features on a first immobilized array during cycles of array replication, said method comprising the steps of: a) amplifying the features of a first array to yield an array of features with a hemispheric radius, r, and a cross-sectional area, q, at the surface supporting said array, such that said features remain essentially distinct; b) contacting said array of features with a radius, r, with a support, maintained at a fixed distance from said first array, said fixed distance less than r, and such that the cross-sectional area of the hemispheric feature, measured at said fixed distance from the surface supporting said first array, is less than q, and such that at least a subset of nucleic acid molecules produced by said amplifying are transferred to said support; and c) covalently affixing said nucleic acid molecules to said support to form a replica of said first immobilized array, wherein the positions of said nucleic acid molecules on said replica correspond to the positions of said nucleic acid molecules of said first array from which they were amplified, and wherein the areas occupied on the surface of said support by the individual features of said replica are less than the areas occupied on the surface supporting said first immobilized array.
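The geometric basis of step b) can be made explicit under the assumption that each amplified feature is an ideal hemisphere of radius r resting on the surface of the first array. A plane parallel to that surface at a height h, where h is less than r, cuts the hemisphere in a circle of radius sqrt(r^2 - h^2), so the cross-sectional area at that height is pi(r^2 - h^2), which is smaller than q = pi r^2 for any h greater than zero. The following minimal sketch computes this area; the function name and the example dimensions are hypothetical.

```python
import math


def cross_section_area(r, h):
    """Area of the circular section of an ideal hemisphere of radius r,
    cut by a plane parallel to its base at height h (0 <= h < r)."""
    if not 0 <= h < r:
        raise ValueError("h must satisfy 0 <= h < r")
    return math.pi * (r * r - h * h)


# Illustrative dimensions in arbitrary units.
r = 10.0
q = cross_section_area(r, 0.0)            # area at the supporting surface
replica_area = cross_section_area(r, 6.0)  # area at the fixed distance
print(replica_area < q)
```

Under this idealization, increasing the fixed distance toward r reduces the area of the transferred feature, which is consistent with the replica features occupying less area than the features of the first array.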

[0016] It is preferred that said amplifying be performed by PCR.

[0017] In another embodiment of the method of preserving the resolution of nucleic acid features on a first immobilized array during cycles of array replication, the method is repeated to yield further replicas with preserved resolution.

[0018] The invention provides a method for determining the nucleotide sequence of the features of an immobilized nucleic acid array, such method comprising the steps of: a) ligating a first double-stranded nucleic acid probe to one end of a nucleic acid of a feature of said array, said first double-stranded nucleic acid probe having a restriction endonuclease recognition site for a restriction endonuclease whose cleavage site is separate from its recognition site and which generates a protruding strand upon cleavage; b) identifying one or more nucleotides at the end of said polynucleotide by the identity of the first double-stranded nucleic acid probe ligated thereto or by extending a strand of the polynucleotide or probe; c) amplifying the features of said array using a primer complementary to said first double-stranded nucleic acid probe, such that only molecules which have been successfully ligated with said first double-stranded nucleic acid probe are amplified to yield an amplified array; d) contacting said amplified array with a support such that at least a subset of nucleic acid molecules produced by said amplifying are transferred to said support; e) covalently attaching said subset of nucleic acid molecules to said support to form a replica of said amplified array; f) cleaving the nucleic acid features of the array with a nuclease recognizing said nuclease recognition site of said probe such that the nucleic acid of the features is shortened by one or more nucleotides; and g) repeating steps (a)-(f) until the nucleotide sequences of the features of said array are determined.

[0019] It is preferred that the nucleic acid probe comprises four components, each component being capable of indicating the presence of a different nucleotide in the protruding strand upon ligation. It is further preferred that each of the components of the probe is labeled with a different fluorescent dye and that the different fluorescent dyes are spectrally resolvable.

[0020] In another embodiment of the invention, the features of the array are amplified after step (e) and before step (f).

[0021] It is preferred that the amplifying be accomplished by PCR.

[0022] In another embodiment, the method of determining the sequence of the features of an immobilized nucleic acid array is modified such that: i) after one or more cycles using said first double-stranded nucleic acid probe in step (a), a distinct nucleic acid probe is used in place of said first double-stranded nucleic acid probe, said distinct nucleic acid probe comprising a restriction endonuclease recognition site for a restriction endonuclease whose cleavage site is separated from its recognition site, said distinct nucleic acid probe also comprising sequences such that a primer complementary to said distinct nucleic acid probe will not hybridize with said first double-stranded nucleic acid probe; and ii) a primer complementary to said distinct nucleic acid probe is used in place of said primer complementary to said first double-stranded nucleic acid probe in step (c), so that selective amplification of those features which successfully completed the previous cycle of restriction and ligation occurs.

[0023] In another embodiment of this modified method of determining the nucleotide sequence of the features of an immobilized nucleic acid array, a new distinct nucleic acid probe is used after each cycle of restriction and ligation, said new distinct nucleic acid probe comprising a sequence such that a primer complementary to that sequence will not hybridize to any probe used in previous cycles.

[0024] The invention provides a method of determining the nucleotide sequence of the features of an array of immobilized nucleic acids comprising the steps of: a) adding a mixture comprising an oligonucleotide primer and a template-dependent polymerase to an array of immobilized nucleic acid features under conditions permitting hybridization of the primer to the immobilized nucleic acids; b) adding a single, fluorescently labeled deoxynucleoside triphosphate to the mixture under conditions which permit incorporation of the labeled deoxynucleotide onto the 3′ end of the primer if it is complementary to the next adjacent base in the sequence to be determined; c) detecting incorporated label by monitoring fluorescence; d) repeating steps (b)-(c) with each of the remaining three labeled deoxynucleoside triphosphates in turn; and e) repeating steps (b)-(d) until the nucleotide sequence is determined.
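The cyclic logic of steps (b)-(e) can be illustrated with an idealized simulation for a single feature. The sketch assumes complete and error-free extension, assumes that a run of identical template bases incorporates a matching number of labeled nucleotides in one addition, and assumes that the detected signal reveals that number. None of these idealizations is stated in the method itself, and the function and parameter names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_stepwise_addition(template, dntp_order="ACGT", max_cycles=100):
    """Simulate reading one feature by adding one labeled dNTP at a time.

    template: bases of the immobilized strand downstream of the primer,
        listed in the order in which the polymerase encounters them.
    Returns the sequence of the strand synthesized from the primer.
    """
    position = 0
    read = []
    for _ in range(max_cycles):              # step (e): repeat the cycle
        if position >= len(template):
            break
        for dntp in dntp_order:              # steps (b) and (d): each dNTP in turn
            incorporated = 0
            # The dNTP is incorporated only while it is complementary
            # to the next adjacent template base.
            while (position < len(template)
                   and COMPLEMENT[template[position]] == dntp):
                incorporated += 1
                position += 1
            # Step (c): the detected label reports the incorporation.
            read.append(dntp * incorporated)
    return "".join(read)


print(sequence_by_stepwise_addition("TTGCA"))
```

The sequence of the immobilized template follows from the returned strand by complementarity.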

[0025] In a preferred embodiment, the primer, buffer and polymerase are cast into a polyacrylamide gel bearing the array of immobilized nucleic acids.

[0026] It is preferred that the single fluorescently labeled deoxynucleotide further comprises a mixture of the single deoxynucleoside triphosphate in labeled and unlabeled forms.

[0027] In another embodiment, the additional step of photobleaching said array is performed after step (d) and before step (e).

[0028] In another embodiment, the fluorescently labeled deoxynucleoside triphosphates are labeled with a cleavable linkage to the fluorophore, and the additional step of cleaving said linkage to the fluorophore is performed after step (d) and before step (e).

[0029] In another embodiment, the oligonucleotide primer comprises sequences permitting formation of a hairpin loop.

[0030] In another embodiment, after a predetermined number of cycles of steps (b)-(d), a defined regimen of deoxynucleotide and chain-terminating deoxynucleotide analog addition is performed, such that out-of-phase molecules are blocked from further extension cycles, said regimen followed by continued cycles of steps (b)-(d) until the nucleotide sequence of the features of the array is determined.

[0031] The invention provides a method of determining the nucleotide sequence of the features of an array of immobilized nucleic acids comprising the steps of: a) adding a mixture comprising an oligonucleotide primer and a template-dependent polymerase to an array of immobilized nucleic acid features under conditions permitting hybridization of the primer to the immobilized nucleic acids; b) adding a first mixture of three unlabeled deoxynucleoside triphosphates under conditions which permit incorporation of deoxynucleotides to the end of the primer if they are complementary to the next adjacent base in the sequence to be determined; c) adding a second mixture of three unlabeled deoxynucleoside triphosphates, along with buffer and polymerase if necessary, said second mixture comprising the deoxynucleoside triphosphate not included in the mixture of step (b), under conditions which permit incorporation of deoxynucleotides to the end of the primer if they are complementary to the next adjacent base in the sequence to be determined; d) repeating steps (b)-(c) for a predetermined number of cycles; e) adding a single, fluorescently labeled deoxynucleoside triphosphate to the mixture under conditions which permit incorporation of the labeled deoxynucleotide onto the 3′ terminus of the primer if it is complementary to the next adjacent base in the sequence to be determined; f) detecting incorporated label by monitoring fluorescence; g) repeating steps (e)-(f), with each of the remaining three labeled deoxynucleoside triphosphates in turn; and h) repeating steps (e)-(g) until the nucleotide sequence is determined.
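The unlabeled pre-extension of steps (b)-(d) can be illustrated for a single feature under idealized assumptions: extension with a mixture of three deoxynucleoside triphosphates proceeds until the template calls for the one nucleotide absent from that mixture, and the second mixture, which supplies that nucleotide, then carries the primer forward until its own missing nucleotide is required. Complete, error-free extension is assumed, and the function names, mixtures and template shown are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def extend_with_mixture(template, position, dntps):
    """Advance the primer while the required nucleotide is in the mixture."""
    while (position < len(template)
           and COMPLEMENT[template[position]] in dntps):
        position += 1
    return position


def pre_extend(template, first_mixture, second_mixture, cycles):
    """Alternate the two unlabeled mixtures for a set number of cycles.

    template: bases downstream of the primer, in the order in which the
        polymerase encounters them.
    Returns the number of template bases copied before labeled
    sequencing begins.
    """
    position = 0
    for _ in range(cycles):                                              # step (d)
        position = extend_with_mixture(template, position, first_mixture)   # step (b)
        position = extend_with_mixture(template, position, second_mixture)  # step (c)
    return position


# Hypothetical mixtures: the first lacks dTTP, the second lacks dCTP.
print(pre_extend("GCACTGT", first_mixture="ACG", second_mixture="AGT", cycles=1))
```

Because the stopping points depend only on the template sequence, every copy of a given feature halts at the same position after each addition in this idealized model.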

[0032] It is preferred that for the first or second mixtures of three unlabeled deoxynucleoside triphosphates, a mixture which comprises deoxyguanosine triphosphate further comprises deoxyadenosine triphosphate.

[0033] In a preferred embodiment of the method, the primer and polymerase are cast into a polyacrylamide gel bearing the array of immobilized nucleic acids.

[0034] In a preferred embodiment, the single fluorescently labeled deoxynucleotide further comprises a mixture of the single deoxynucleoside triphosphate in labeled and unlabeled forms.

[0035] In another embodiment of this method of determining the nucleotide sequence of nucleic acid features on an array, the additional step of photobleaching the array is performed after step (g) and before step (h).

[0036] In another embodiment of this method of determining the nucleotide sequence of nucleic acid features on an array, the fluorescently labeled deoxynucleoside triphosphates are labeled with a cleavable linkage to the fluorophore, and the additional step of cleaving the linkage to the fluorophore is performed after step (g) and before step (h).

[0037] In another embodiment of this method of determining the nucleotide sequence of nucleic acid features on an array, the oligonucleotide primer comprises sequences permitting formation of a hairpin loop.

[0038] In another embodiment of this method of determining the nucleotide sequence of nucleic acid features on an array, after a predetermined number of cycles of steps (e)-(g), a defined regimen of deoxynucleotide and chain-terminating deoxynucleotide analog addition is performed, such that out-of-phase molecules are blocked from further extension cycles, said regimen followed by continued cycles of steps (e)-(g) until said nucleotide sequence is determined.

[0039] The invention provides a method of determining the nucleotide sequence of the features of a micro-array of nucleic acid molecules, said method comprising the steps of: a) creating a micro-array of nucleic acid features in a linear arrangement within and along one side of a polyacrylamide gel, said gel further comprising one or more oligonucleotide primers, and a template-dependent polymerizing activity; b) amplifying the micro-array; c) adding a mixture of deoxynucleoside triphosphates, said mixture comprising each of the four deoxynucleoside triphosphates dATP, dGTP, dCTP and dTTP, said mixture further comprising chain-terminating analogs of each of the deoxynucleoside triphosphates dATP, dGTP, dCTP and dTTP, and said chain-terminating analogs each labeled with a spectrally distinguishable fluorescent moiety; d) incubating said mixture with said micro-array under conditions permitting extension of said one or more oligonucleotide primers; e) electrophoretically separating the products of said extension within said polyacrylamide gel; and f) determining the nucleotide sequence of the features of said micro-array by detecting the fluorescence of the extended, terminated and separated reaction products within the gel.

[0040] It is preferred that the amplifying be performed by PCR.

[0041] In another embodiment, the amplifying may be performed by an isothermal method.

[0042] In another embodiment the microarray of nucleic acid features in a linear arrangement is derived as a replica of features arranged on a chromosome.

[0043] In another embodiment the microarray of nucleic acid features in a linear arrangement is derived as a replica of one linear subset of features on a separate, non-linear micro-array of nucleic acid features.

[0044] The invention provides a method of simultaneously amplifying a plurality of nucleic acids, said method comprising the steps of: a) creating a micro-array of immobilized oligonucleotide primers; b) incubating the microarray with amplification template and a non-immobilized oligonucleotide primer under conditions allowing hybridization of said template with said oligonucleotide primers; c) incubating the hybridized primers and template with a DNA polymerase activity and deoxynucleotide triphosphates under conditions permitting extension of the primers; and d) repeating steps (b) and (c) for a defined number of cycles to yield a plurality of amplified DNA molecules.

[0045] It is preferred that the non-immobilized oligonucleotide primer comprises a pool of oligonucleotide primers comprised of 5′ and 3′ sequence elements, said 5′ sequence element identical in all members of said pool, and said 3′ sequence element containing random sequences.

[0046] It is preferred that the 5′ sequence element comprises a restriction endonuclease recognition sequence.
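The structure of such a primer pool, with a constant 5′ element and a random 3′ element, may be sketched as follows. The function name, the pool size, the length of the random element and the use of the EcoRI recognition sequence GAATTC as the constant element are illustrative assumptions only; no particular sequence or length is specified above.

```python
import random


def make_primer_pool(five_prime_element, random_length, pool_size, seed=None):
    """Build primers sharing one 5' element and differing in a random 3' element.

    Sequences are written 5' to 3'. All parameter names are hypothetical.
    """
    rng = random.Random(seed)  # seeded generator for reproducible examples
    pool = []
    for _ in range(pool_size):
        random_element = "".join(rng.choice("ACGT") for _ in range(random_length))
        pool.append(five_prime_element + random_element)
    return pool


# GAATTC is the EcoRI recognition sequence, used here purely as an example.
for primer in make_primer_pool("GAATTC", random_length=6, pool_size=4, seed=1):
    print(primer)
```

Each primer in the resulting pool begins with the shared element, so extension products all carry that element at one end regardless of where the random 3′ element hybridized.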

[0047] In another embodiment, the 5′ sequence element comprises a transcriptional promoter sequence.

[0048] In another embodiment, the immobilized primers are amplified before step (b).

[0049] In another embodiment, the immobilized oligonucleotide primers are generated from genomic DNA.

[0050] In a preferred embodiment, the microarray, template, non-immobilized primer, and polymerase are cast in a polyacrylamide gel.

[0051] As used herein in reference to nucleic acid arrays, the term “plurality” is defined as designating two or more such arrays, wherein a first (or “template”) array plus a second array made from it comprise a plurality. When such a plurality comprises more than two arrays, arrays beyond the second array may be produced using either the first array or any copy of it as a template.

[0052] As used herein, the terms “randomly-patterned” or “random” refer to a non-ordered, non-Cartesian distribution (in other words, not arranged at pre-determined points along the x- and y-axes of a grid or at defined ‘clock positions’, degrees or radii from the center of a radial pattern) of nucleic acid molecules over a support, that is not achieved through an intentional design (or program by which such a design may be achieved) or by placement of individual nucleic acid features. Such a “randomly-patterned” or “random” array of nucleic acids may be achieved by dropping, spraying, plating or spreading a solution, emulsion, aerosol, vapor or dry preparation comprising a pool of nucleic acid molecules onto a support and allowing the nucleic acid molecules to settle onto the support without intervention in any manner to direct them to specific sites thereon.

[0053] As used herein, the terms “immobilized” or “affixed” refer to covalent linkage between a nucleic acid molecule and a support matrix.

[0054] As used herein, the term “array” refers to a heterogeneous pool of nucleic acid molecules that is distributed over a support matrix; preferably, these molecules differing in sequence are spaced at a distance from one another sufficient to permit the identification of discrete features of the array.

[0055] As used herein, the term “heterogeneous” is defined to refer to a population or collection of nucleic acid molecules that comprises a plurality of different sequences; it is contemplated that a heterogeneous pool of nucleic acid molecules results from a preparation of RNA or DNA from a cell which may be unfractionated or partially-fractionated.

[0056] An “unfractionated” nucleic acid preparation is defined as that which has not undergone the selective removal of any sequences present in the complement of RNA or DNA, as the case may be, of the biological sample from which it was prepared. A nucleic acid preparation in which the average molecular weight has been lowered by cleaving the component nucleic acid molecules, but which still retains all sequences, is still “unfractionated” according to this definition, as it retains the diversity of sequences present in the biological sample from which it was prepared.

[0057] A “partially-fractionated” nucleic acid preparation may have undergone qualitative size-selection. In this case, uncleaved sequences, such as whole chromosomes or RNA molecules, are selectively retained or removed based upon size. In addition, a “partially-fractionated” preparation may comprise molecules that have undergone selection through hybridization to a sequence of interest; alternatively, a “partially-fractionated” preparation may have had undesirable sequences removed through hybridization. It is contemplated that a “partially-fractionated” pool of nucleic acid molecules will not comprise a single sequence that has been enriched after extraction from the biological sample to the point at which it is pure, or substantially pure.

[0058] In this context, “substantially pure” refers to a single nucleic acid sequence that is represented by a majority of nucleic acid molecules of the pool. Again, this refers to enrichment of a sequence in vitro; if a given sequence is heavily represented in the biological sample itself, a preparation containing it is not excluded from use according to the invention.

[0059] As used herein, the term “biological sample” refers to a whole organism or a subset of its tissues, cells or component parts (e.g. fluids). “Biological sample” further refers to a homogenate, lysate or extract prepared from a whole organism or a subset of its tissues, cells or component parts, or a fraction or portion thereof. Lastly, “biological sample” refers to a medium, such as a nutrient broth or gel, in which an organism has been propagated and which contains cellular components, such as nucleic acid molecules.

[0060] As used herein, the term “organism” refers to all cellular life-forms, such as prokaryotes and eukaryotes, as well as non-cellular, nucleic acid-containing entities, such as bacteriophage and viruses.

[0061] As used herein, the term “feature” refers to each nucleic acid sequence occupying a discrete physical location on the array; if a given sequence is represented at more than one such site, each site is classified as a feature. In this context, the term “nucleic acid sequence” may refer either to a single nucleic acid molecule, whether double- or single-stranded, to a “clone” of amplified copies of a nucleic acid molecule present at the same physical location on the array or to a replica, on a separate support, of such a clone.

[0062] As used herein, the term “amplifying” refers to production of copies of a nucleic acid molecule of the array via repeated rounds of primed enzymatic synthesis; “in situ amplification” indicates that such amplifying takes place with the template nucleic acid molecule positioned on a support according to the invention, rather than in solution.

[0063] As used herein, the term “support” refers to a matrix upon which nucleic acid molecules of a nucleic acid array are immobilized; preferably, a support is semi-solid.

[0064] As used herein, the term “semi-solid” refers to a compressible matrix with both a solid and a liquid component, wherein the liquid occupies pores, spaces or other interstices between the solid matrix elements.

[0065] As used herein in reference to the physical placement of nucleic acid molecules or features and/or their orientation relative to one another on an array of the invention, the terms “correspond” or “corresponding” refer to a molecule occupying a position on a second array that is either identical to, or a mirror image of, the position of a molecule from which it was amplified on a first array which served as a template for the production of the second array, or vice versa, such that the arrangement of features of the array relative to one another is conserved between arrays of a plurality.

[0066] As implied by the above statement, a first and second array of a plurality of nucleic acid arrays according to the invention may be of either like or opposite chirality, that is, the patterning of the nucleic acid arrays may be either identical or mirror-imaged.

[0067] As used herein, the term “replica” refers to any nucleic acid array that is produced by a printing process according to the invention using as a template a first randomly-patterned immobilized nucleic acid array.

[0068] As used herein, the term “spot” as applied to a component of a microarray refers to a discrete area of a surface containing a substance deposited by mechanical or other means.

[0069] As used herein, “excluded volume” refers to the volume of space occupied by a particular molecule to the exclusion of other such molecules.

[0070] As used herein, “excess of nucleic acid molecules” refers to an amount of nucleic acid molecules greater than the amount of entities to which such nucleic acid molecules may bind. An excess may range from as few as one molecule more than the number of binding entities, to twice the number of binding entities, up to 10 times, 100 times, 1000 times the number of binding entities or more.

[0071] As used herein, “signal amplification method” refers to any method by which the detection of a nucleic acid is accomplished.

[0072] As used herein, a “nucleic acid capture ligand” or “nucleic acid capture activity” refers to any substance which binds nucleic acid molecules, either specifically or non-specifically, or which binds an affinity tag attached to a nucleic acid molecule in such a way as to immobilize the nucleic acid molecule to a support bearing the capture ligand.

[0073] As used herein, “replica-destructive” refers to methods of signal amplification which render an array or replica of an array non-reusable.

[0074] As used herein, the term “non-reusable,” in reference to an array or replica of an array, indicates that, due to the nature of detection methods employed, the array can be neither replicated nor used for subsequent detection methods after the first detection method is performed.

[0075] As used herein, the term “essentially distinct” as applied to features of an array refers to the situation where 90% or more of the features of an array are not in contact with other features on the same array.
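The 90% criterion can be illustrated with a minimal sketch in which each feature is modeled as a circle on the support, and two features are treated as being in contact when the distance between their centers does not exceed the sum of their radii. The circular model, the contact test and all names are assumptions made for illustration; the definition above does not prescribe how contact is measured.

```python
import math


def is_essentially_distinct(features, threshold=0.90):
    """Return True if at least `threshold` of the features touch no other.

    features: list of (x, y, radius) tuples describing circular features.
    """
    count = len(features)
    if count == 0:
        return True  # no features, hence no contacts (a convention chosen here)
    touching = [False] * count
    for i in range(count):
        xi, yi, ri = features[i]
        for j in range(i + 1, count):
            xj, yj, rj = features[j]
            # Contact when the center distance is within the summed radii.
            if math.hypot(xi - xj, yi - yj) <= ri + rj:
                touching[i] = True
                touching[j] = True
    isolated = touching.count(False)
    return isolated / count >= threshold


# Ten well separated features of radius 1 on a line, 10 units apart.
separated = [(10.0 * k, 0.0, 1.0) for k in range(10)]
print(is_essentially_distinct(separated))
```

Moving one feature of this example into contact with a neighbor leaves only eight of ten features isolated, so the array would no longer meet the criterion under this model.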

[0076] As used herein, the term “preserved” as applied to the resolution of nucleic acid features on an array means that the features remain essentially distinct after a given process has been performed.

[0077] As used herein, the term “distinguishable” as applied to a label, refers to a labeling moiety which can be detected when among other labeling moieties.

[0078] As used herein, the term “spectrally distinguishable” or “spectrally resolvable” as applied to a label, refers to a labeling moiety which can be detected by its characteristic fluorescent excitation or emission spectra, one or both of such spectra distinguishing said moiety from other moieties used separately or simultaneously in the particular method.

[0079] As used herein, the term “chain-terminating analog” refers to any nucleotide analog which, once incorporated onto the 3′ end of a nucleic acid molecule, cannot serve as a substrate for further addition of nucleotides to that nucleic acid molecule.

[0080] As used herein, the term “type IIS” refers to a restriction enzyme that cuts at a site remote from its recognition sequence. Such enzymes are known to cut at distances from their recognition sites ranging from 0 to 20 base pairs.

[0081] It is preferred that the support is semi-solid.

[0082] Preferably, the semi-solid support is selected from the group that includes polyacrylamide, cellulose, polyamide (nylon) and cross-linked agarose, -dextran and -polyethylene glycol.

[0083] It is particularly preferred that amplifying of nucleic acid molecules is performed by polymerase chain reaction (PCR).

[0084] Preferably, affixing of nucleic acid molecules to the support is performed using a covalent linker that is selected from the group that includes oxidized 3-methyl uridine, an acrylyl group and hexaethylene glycol. Additionally, Acrydite oligonucleotide primers may be covalently fixed within a polyacrylamide gel.

[0085] It is also contemplated that affixing of nucleic acid molecules to the support is performed via hybridization of the members of the pool to nucleic acid molecules that are covalently bound to the support.

[0086] As used herein, the term “synthetic oligonucleotide” refers to a short (10 to 1,000 nucleotides in length), double- or single-stranded nucleic acid molecule that is chemically synthesized or is the product of a biological system such as a product of primed or unprimed enzymatic synthesis.

DETAILED DESCRIPTION OF THE INVENTION

[0087] The present invention is directed to the synthesis of nucleic acid array chips, methods by which such chips may be reproduced and methods by which they may be used in diverse applications relating to nucleic acid replication or amplification, genomic characterization, gene expression studies, medical diagnostics and population genetics. The nucleic acid array chips of the replica array have several advantages over the presently available methods.

[0088] Besides any known sequences or combinatorial sequence thereof, a full genome including unknown DNA sequences can be replicated according to the present invention. The size of the nucleic acid fragments or primers to be replicated can be from about 25-mer to about 9000-mer. The present invention is also quick and cost effective. It takes only about one week from discovery of an organism to arrange the full genome sequence of the organism onto chips, at a cost of about $10 per chip. In addition, the thickness of the chips is 3000 nm, which provides a much higher sensitivity. The chips are compatible with inexpensive in situ PCR devices, and can be reused as many as 100 times.

[0089] The invention provides for an advance over the arrays of Chetverin and Kramer (WO 93/17126), Chetverin and Chetverina, 1997 (U.S. Pat. No. 5,616,478), and others, in that a method is herein described by which to produce a random nucleic acid array that both is covalently linked to a support (therefore extensively reusable) and permits one to fabricate high-fidelity copies of it without returning to the starting point of the process, thereby eliminating time-consuming, expensive steps and providing for reproducible results both when the copies of the array are made and when they are used. It is evident that this method is not obvious, despite its great utility. No mention of replica plating or printing of amplimers in this context appears to have been made in oligonucleotide array patents or papers. There is no method in the prior art for generating a set of nucleic acid arrays comprising the steps of covalently linking a pool of nucleic acid molecules to a support to form a random array, amplifying the nucleic acid molecules and subsequently replicating the array.

[0090] While reproducibility of manufacture and durability are not of significant concern in the making of arrays in which the nucleic acid molecules are chemically synthesized directly on the support, they are centrally important in cases in which the molecules of the array are of natural origin (for example, a sample of mRNA from an organism). Each nucleic acid sample obtained from a natural source constitutes a unique pool of molecules; these molecules are, themselves, uniquely distributed over the surface of the support, in that the original laying out of the pattern is random. By any prior art method, an array generated from simple, random deposition of a pool of nucleic acid molecules is irreproducible; however, a set of related arrays would be of great utility, since information derived from any one copy from the replicated set would increase the confidence in the identity and/or quality of data generated using the other members of the set.

[0091] The methods provided in the present invention basically consist of 5 steps: 1) providing a pool of nucleic acid molecules, 2) plating or other transfer of the pool onto a solid support, 3) in situ amplification, 4) replica printing of the amplified nucleic acids and 5) identification of features. Sets of arrays so produced, or members thereof, then may be put to any chip affinity readout use, some of which are summarized below. The production of a set of arrays according to the invention is described in Example 1. The following examples are provided for exemplification purposes only and are not intended to limit the scope of the invention which has been described in broad terms above.

EXAMPLE 1

[0092] Production of a Plurality of Nucleic Acid Arrays According to the Invention

[0093] Step 1. Production of a Nucleic Acid Pool with which to Construct an Array According to the Invention

[0094] A pool or library of n-mers (n=20 to 9000) is made by any of several methods. The pool is either amplified (e.g. by PCR) or left unamplified. A suitable in vitro amplification “vector”, for example, flanking PCR primer sequences or an in vivo plasmid, phage or viral vector from which amplified molecules are excised prior to use, is used. If necessary, random shearing or enzymatic cleavage of large nucleic acid molecules is used to generate the pools; if the nucleic acid molecules are amplified, cleavage is performed either before or after amplification. Alternatively, a nucleic acid sample is random primed, for example with tagged 3′ terminal hexamers, followed by electrophoretic size-selection. The nucleic acid is selected from genomic, synthetic or cDNA sequences (Power, 1996, J. Hosp. Infect., 34: 247-265; Welsh, et al., 1995, Mutation Res., 338: 215-229). The copied or unamplified nucleic acid fragments resulting from any of the above procedures are, if desired, fractionated by size or affinity by a variety of methods including electrophoresis, sedimentation, and chromatography (possibly including elaborate, expensive procedures or limited-quantity resources since the subsequent inexpensive replication methods can justify such investment of effort).

[0095] Pools of nucleic acid molecules are, at this stage, applied directly to the support medium (see Step 2, below). Alternatively, they are cloned into nucleic acid vectors. For example, pools composed of fragments with inherent polarity, such as cDNA molecules, are directionally cloned into nucleic acid vectors that comprise, at the cloning site, oligonucleotide linkers that provide asymmetric flanking sequences to the fragments. Upon their subsequent removal via restriction with enzymes that cleave the vector outside both the cloned fragment and linker sequences, molecules with defined (and different) sequences at their two ends are generated. By denaturing these molecules and spreading them onto a semi-solid support to which are covalently bound oligonucleotides that are complementary to one preferred flanking linker, the orientation of each molecule in the array is determined relative to the surface of the support. Such a polar array is of use for in vitro transcription/translation of the array or any purpose for which directional uniformity is preferred.

[0096] In addition to the attachment of linker sequences to the molecules of the pool for use in directional attachment to the support, a restriction site or regulatory element (such as a promoter element, cap site or translational termination signal) is, if desired, joined with the members of the pool. The use of fragments with termini engineered to comprise useful restriction sites is described below in Example 6.

[0097] Step 2. Transfer of the Nucleic Acid Pool onto a Support Medium

[0098] The nucleic acid pool is diluted (“plated”) out onto a semi-solid medium (such as a polyacrylamide gel) on a solid surface such as a glass slide such that amplifiable molecules are 0.1 to 100 micrometers apart. Sufficient spacing is maintained that features of the array do not contaminate one another during repeated rounds of amplification and replication. It is estimated that a molecule that is immobilized at one end can, at most, diffuse the distance of a single molecule length during each round of replication. Obviously, arrays of shorter molecules are plated at higher density than those comprising long molecules.
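For orientation, the stated spacing range can be converted into an approximate feature density. The following is a minimal sketch that assumes, purely for the arithmetic, that the molecules lie on a regular square grid; the actual deposition described above is random, so the result is only an order-of-magnitude guide, and the function name is hypothetical.

```python
def features_per_cm2(spacing_um):
    """Approximate feature density for a given center-to-center spacing.

    Assumption: a regular square grid, which the random plating described
    in the text only approximates on average.
    """
    if spacing_um <= 0:
        raise ValueError("spacing must be positive")
    per_cm = 10_000.0 / spacing_um  # 1 cm = 10,000 micrometers
    return per_cm ** 2

# The two ends of the range given in the text, 0.1 and 100 micrometers.
for spacing in (0.1, 100.0):
    print(spacing, "um spacing ->", features_per_cm2(spacing), "features per cm^2")
```

Under this assumption, the range of 0.1 to 100 micrometers corresponds to roughly ten billion down to ten thousand features per square centimeter.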

[0099] Immobilizing media that are of use according to the invention are physically stable and chemically inert under the conditions required for nucleic acid molecule deposition, amplification and the subsequent replication of the array. A useful support matrix withstands the rapid changes in, and extremes of, temperature required for PCR and retains structural integrity under stress during the replica printing process. The support material permits enzymatic nucleic acid synthesis; if it is unknown whether a given substance will do so, it is tested empirically prior to any attempt at production of a set of arrays according to the invention. The support structure comprises a semi-solid (i.e. gelatinous) lattice or matrix, wherein the interstices or pores between lattice or matrix elements are filled with an aqueous or other liquid medium; typical pore (or ‘sieve’) sizes are in the range of 100 μm to 5 nm. Larger spaces between matrix elements are within tolerance limits, but the potential for diffusion of amplified products prior to their immobilization is increased. The semi-solid support is compressible, so that full surface-to-surface contact, essentially sufficient to form a seal between two supports, although that is not the object, may be achieved during replica printing. The support is prepared such that it is planar, or effectively so, for the purposes of printing; for example, an effectively planar support might be cylindrical, such that the nucleic acids of the array are distributed over its outer surface in order to contact other supports, which are either planar or cylindrical, by rolling one over the other. Lastly, a support material of use according to the invention permits immobilizing (covalent linking) of nucleic acid features of an array to it by means enumerated below. Materials that satisfy these requirements comprise both organic and inorganic substances, and include, but are not limited to, polyacrylamide, cellulose and polyamide (nylon), as well as cross-linked agarose, dextran or polyethylene glycol.

[0100] Of the support media upon which the members of the pool of nucleic acid molecules may be anchored, one that is particularly preferred is a thin polyacrylamide gel on a glass support, such as a plate, slide or chip. A polyacrylamide sheet of this type is synthesized as follows: Acrylamide and bis-acrylamide are mixed in a ratio that is designed to yield the degree of crosslinking between individual polymer strands (for example, a ratio of 38:2 is typical of sequencing gels) that results in the desired pore size when the overall percentage of the mixture used in the gel is adjusted to give the polyacrylamide sheet its required tensile properties. Polyacrylamide gel casting methods are well known in the art (see Sambrook et al., 1989, Molecular Cloning. A Laboratory Manual., 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.), and one of skill has no difficulty in making such adjustments.

[0101] The gel sheet is cast between two rigid surfaces, at least one of which is the glass to which it will remain attached after removal of the other. The casting surface that is to be removed after polymerization is complete is coated with a lubricant that will not inhibit gel polymerization; for this purpose, silane is commonly employed. A layer of silane is spread upon the surface under a fume hood and allowed to stand until nearly dry. Excess silane is then removed (wiped or, in the case of small objects, rinsed extensively) with ethanol. The glass surface which will remain in association with the gel sheet is treated with γ-methacryloxypropyltrimethoxysilane (Cat. No. M6514, Sigma; St. Louis, Mo.), often referred to as ‘crosslink silane’, prior to casting. The glass surface that will contact the gel is triply-coated with this agent. Each treatment of an area equal to 1200 cm² requires 125 μl of crosslink silane in 25 ml of ethanol. Immediately before this solution is spread over the glass surface, it is combined with a mixture of 750 μl water and 75 μl glacial acetic acid and shaken vigorously. The ethanol solvent is allowed to evaporate between coatings (about 5 minutes under a fume hood) and, after the last coat has dried, excess crosslink silane is removed as completely as possible via extensive ethanol washes in order to prevent ‘sandwiching’ of the other support plate onto the gel. The plates are then assembled and the gel cast as desired.
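The reagent quantities given above for a 1200 cm² area may be scaled to other plate sizes. The following is a minimal sketch that assumes the volumes scale linearly with the treated area and that each of the three coats uses a freshly prepared mixture; both points are assumptions made for illustration, and the function and key names are hypothetical.

```python
REFERENCE_AREA_CM2 = 1200.0

# Per-coat quantities stated in the text for 1200 cm^2 of glass.
PER_COAT_AT_REFERENCE = {
    "crosslink_silane_ul": 125.0,
    "ethanol_ml": 25.0,
    "water_ul": 750.0,
    "glacial_acetic_acid_ul": 75.0,
}

def coating_volumes(area_cm2, coats=3):
    """Return (per_coat, total) reagent quantities for a given glass area.

    Assumption: quantities scale linearly with area, and every coat
    uses a fresh batch of the mixture.
    """
    if area_cm2 <= 0 or coats <= 0:
        raise ValueError("area and number of coats must be positive")
    scale = area_cm2 / REFERENCE_AREA_CM2
    per_coat = {name: qty * scale for name, qty in PER_COAT_AT_REFERENCE.items()}
    total = {name: qty * coats for name, qty in per_coat.items()}
    return per_coat, total

per_coat, total = coating_volumes(600.0)
print(per_coat["crosslink_silane_ul"], total["crosslink_silane_ul"])
```

For a plate of half the reference area, the sketch halves each per-coat quantity and then multiplies by the three coats specified in the text.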

[0102] The only operative constraint that determines the size of a gel that is of use according to the invention is the physical ability of one of skill in the art to cast such a gel. The casting of gels of up to one meter in length is, while cumbersome, a procedure well known to workers skilled in nucleic acid sequencing technology. A larger gel, if produced, is also of use according to the invention. An extremely small gel is cut from a larger whole after polymerization is complete.

[0103] Note that at least one procedure for casting a polyacrylamide gel with bioactive substances, such as enzymes, entrapped within its matrix is known in the art (O'Driscoll, 1976, Methods Enzymol., 44: 169-183); a similar protocol, using photo-crosslinkable polyethylene glycol resins, that permits entrapment of living cells in a gel matrix has also been documented (Nojima and Yamada, 1987, Methods Enzymol., 136: 380-394). Such methods are of use according to the invention. As mentioned below, whole cells are typically cast into agarose for the purpose of delivering intact chromosomal DNA into a matrix suitable for pulsed-field gel electrophoresis or to serve as a “lawn” of host cells that will support bacteriophage growth prior to the lifting of plaques according to the method of Benton and Davis (see Maniatis et al., 1982, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). In short, electrophoresis-grade agarose (e.g. Ultrapure; Life Technologies/Gibco-BRL) is dissolved in a physiological (isotonic) buffer and allowed to equilibrate to a temperature of 50° to 52° C. in a tube, bottle or flask. Cells are then added to the agarose and mixed thoroughly, but rapidly (if in a bottle or tube, by capping and inversion; if in a flask, by swirling), before the mixture is decanted or pipetted into a gel tray. If low-melting point agarose is used, it may be brought to a much lower temperature (down to approximately room temperature, depending upon the concentration of the agarose) prior to the addition of cells. This is desirable for some cell types; however, if electrophoresis is to follow cell lysis prior to covalent attachment of the molecules of the resultant nucleic acid pool to the support, it is performed under refrigeration, such as in a 4° to 10° C. ‘cold’ room.

[0104] Immobilization of nucleic acid molecules to the support matrix according to the invention is accomplished by any of several procedures. Direct immobilizing, as through use of 3′-terminal tags bearing chemical groups suitable for covalent linkage to the support, hybridization of single-stranded molecules of the pool of nucleic acid molecules to oligonucleotide primers already bound to the support or the spreading of the nucleic acid molecules on the support accompanied by the introduction of primers, added either before or after plating, that may be covalently linked to the support, may be performed. Where pre-immobilized primers are used, they are designed to capture a broad spectrum of sequence motifs (for example, all possible multimers of a given chain length, e.g. hexamers), nucleic acids with homology to a specific sequence or nucleic acids containing variations on a particular sequence motif. Alternatively, the primers encompass a synthetic molecular feature common to all members of the pool of nucleic acid molecules, such as a linker sequence (see above).

[0105] Oligonucleotide primers useful according to the invention are single-stranded DNA or RNA molecules that are hybridizable to a nucleic acid template to prime enzymatic synthesis of a second nucleic acid strand. The primer is complementary to a portion of a target molecule present in a pool of nucleic acid molecules used in the preparation of sets of arrays of the invention.

[0106] It is contemplated that such a molecule is prepared by synthetic methods, either chemical or enzymatic. Alternatively, such a molecule or a fragment thereof is naturally occurring, and is isolated from its natural source or purchased from a commercial supplier. Oligonucleotide primers are 6 to 100, and even up to 1,000, nucleotides in length, but ideally from 10 to 30 nucleotides, although oligonucleotides of different length are of use.

[0107] Typically, selective hybridization occurs when two nucleic acid sequences are substantially complementary (at least about 65% complementary over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about 90% complementary). See Kanehisa, M., 1984, Nucleic Acids Res. 12: 203, incorporated herein by reference. As a result, it is expected that a certain degree of mismatch at the priming site is tolerated. Such mismatch may be small, such as a mono-, di- or tri-nucleotide. Alternatively, it may encompass loops, which we define as regions in which mismatch encompasses an uninterrupted series of four or more nucleotides.
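The percentage thresholds recited above can be illustrated with a short calculation. The following is a minimal sketch that assumes a gap-free, position-by-position antiparallel alignment of a DNA primer against a target site of equal length, both written 5′ to 3′; loops and other alignment gaps are not modeled, and the function names are hypothetical.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(seq.upper()))

def percent_complementary(primer, site):
    """Percentage of primer positions that base-pair with the target site.

    Both sequences are given 5' to 3' and must have equal length; the primer
    is paired antiparallel with the site (an ungapped alignment is assumed).
    """
    if len(primer) != len(site) or not primer:
        raise ValueError("primer and site must be non-empty and of equal length")
    primer, site = primer.upper(), site.upper()
    matches = sum(1 for p, s in zip(primer, reversed(site)) if PAIR.get(p) == s)
    return 100.0 * matches / len(primer)

def is_substantially_complementary(primer, site, min_percent=65.0, min_length=14):
    """Apply the 'at least about 65% over at least 14 nucleotides' criterion."""
    return len(primer) >= min_length and percent_complementary(primer, site) >= min_percent
```

Passing min_percent=75.0 or min_percent=90.0 applies the preferred and more preferred thresholds recited in the same paragraph.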

[0108] Overall, five factors influence the efficiency and selectivity of hybridization of the primer to a second nucleic acid molecule. These factors, which are (i) primer length, (ii) the nucleotide sequence and/or composition, (iii) hybridization temperature, (iv) buffer chemistry and (v) the potential for steric hindrance in the region to which the primer is required to hybridize, are important considerations when non-random priming sequences are designed.

[0109] There is a positive correlation between primer length and both the efficiency and accuracy with which a primer will anneal to a target sequence; longer sequences have a higher T_(M) than do shorter ones, and are less likely to be repeated within a given target sequence, thereby cutting down on promiscuous hybridization. Primer sequences with a high G-C content or that comprise palindromic sequences tend to self-hybridize, as do their intended target sites, since unimolecular, rather than bimolecular, hybridization kinetics are generally favored in solution; at the same time, it is important to design a primer containing sufficient numbers of G-C nucleotide pairings to bind the target sequence tightly, since each such pair is bound by three hydrogen bonds, rather than the two that are found when A and T bases pair. Hybridization temperature varies inversely with primer annealing efficiency, as does the concentration of organic solvents, e.g. formamide, that might be included in a hybridization mixture, while increases in salt concentration facilitate binding. Under stringent hybridization conditions, longer probes hybridize more efficiently than do shorter ones, which are sufficient under more permissive conditions. Stringent hybridization conditions typically include salt concentrations of less than about 1M, more usually less than about 500 mM and preferably less than about 200 mM. Hybridization temperatures range from as low as 0° C. to greater than 22° C., greater than about 30° C., and (most often) in excess of about 37° C. Longer fragments may require higher hybridization temperatures for specific hybridization. As several factors affect the stringency of hybridization, the combination of parameters is more important than the absolute measure of any one alone.
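Several of the sequence properties discussed above lend themselves to simple computation. The following is a minimal sketch, not a primer design tool: it reports G-C content, counts duplex hydrogen bonds using the three-per-G-C and two-per-A-T figures stated in the text, and flags a sequence that equals its own reverse complement as palindromic. The function names are hypothetical, and no melting temperature formula is implied.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def duplex_hydrogen_bonds(seq):
    """Hydrogen bonds in a perfectly matched duplex: 3 per G-C pair, 2 per A-T pair."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 3 * gc + 2 * at

def is_palindromic(seq):
    """True when the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)
```

A candidate primer that is palindromic, or whose G-C fraction is very high, would be flagged for the self-hybridization risk described above; how high is too high is left to the practitioner.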

[0110] Primers are designed with the above first four considerations in mind. While estimates of the relative merits of numerous sequences are made mentally, computer programs have been designed to assist in the evaluation of these several parameters and the optimization of primer sequences. Examples of such programs are “PrimerSelect” of the DNAStar™ software package (DNAStar, Inc.; Madison, Wis.) and OLIGO 4.0 (National Biosciences, Inc.). Once designed, suitable oligonucleotides are prepared by a suitable method, e.g. the phosphoramidite method described by Beaucage and Caruthers (1981, Tetrahedron Lett., 22: 1859-1862) or the triester method according to Matteucci et al. (1981, J. Am. Chem. Soc., 103: 3185), both incorporated herein by reference, or by other chemical methods using either a commercial automated oligonucleotide synthesizer or VLSIPS™ technology.

[0111] Two means of crosslinking a nucleic acid molecule to a preferred support of the invention, a polyacrylamide gel sheet, will be discussed in some detail. The first (provided by Khrapko et al., 1996, U.S. Pat. No. 5,552,270) involves the 3′ capping of nucleic acid molecules with 3-methyl uridine; using this method, the nucleic acid molecules of the libraries of the present invention are prepared so as to include this modified base at their 3′ ends. In the cited protocol, an 8% polyacrylamide gel (30:1, acrylamide: bis-acrylamide) sheet 30 μm in thickness is cast and then exposed to 50% hydrazine at room temperature for 1 hour; such a gel is also of use according to the present invention. The matrix is then air dried to the extent that it will absorb a solution containing nucleic acid molecules, as described below. Nucleic acid molecules containing 3-methyl uridine at their 3′ ends are oxidized with 1 mM sodium periodate (NaIO₄) for 10 minutes to 1 hour at room temperature, precipitated with 8 to 10 volumes of 2% LiClO₄ in acetone and dissolved in water at a concentration of 10 pmol/μl. This concentration is adjusted so that when the nucleic acid molecules are spread upon the support in a volume that covers its surface evenly, yet is efficiently (i.e. completely) absorbed by it, the density of nucleic acid molecules of the array falls within the range discussed above. The nucleic acid molecules are spread over the gel surface and the plates are placed in a humidified chamber for 4 hours. They are then dried for 0.5 hour at room temperature and washed in a buffer that is appropriate to their subsequent use. Alternatively, the gels are rinsed in water, re-dried and stored at −20° C. until needed. It is said that the overall yield of nucleic acid that is bound to the gel is 80% and that of these molecules, 98% are specifically linked through their oxidized 3′ groups.

[0112] A second crosslinking moiety that is of use in attaching nucleic acid molecules covalently to a polyacrylamide sheet is a 5′ acrylyl group, which is attached to the primers used in Example 6. Oligonucleotide primers bearing such a modified base at their 5′ ends may be used according to the invention. In particular, such oligonucleotides are cast directly into the gel, such that the acrylyl group becomes an integral, covalently-bonded part of the polymerizing matrix. The 3′ end of the primer remains unbound, so that it is free to interact with, and hybridize to, a nucleic acid molecule of the pool and prime its enzymatic second-strand synthesis.

[0113] Alternatively, hexaethylene glycol is used to covalently link nucleic acid molecules to nylon or other support matrices (Adams and Kron, 1994, U.S. Pat. No. 5,641,658). In addition, nucleic acid molecules are crosslinked to nylon via irradiation with ultraviolet light. While the length of time for which a support is irradiated as well as the optimal distance from the ultraviolet source is calibrated with each instrument used, due to variations in wavelength and transmission strength, at least one irradiation device designed specifically for crosslinking of nucleic acid molecules to hybridization membranes is commercially available (Stratalinker; Stratagene). It should be noted that in the process of crosslinking via irradiation, limited nicking of nucleic acid strands occurs; however, the amount of nicking is generally negligible under conditions such as those used in hybridization procedures. Attachment of nucleic acid molecules to the support at positions that are neither 5′- nor 3′-terminal also occurs, but it should be noted that the potential for utility of an array so crosslinked is largely uncompromised, as such crosslinking does not inhibit hybridization of oligonucleotide primers to the immobilized molecule where it is bonded to the support. The production of ‘terminal’ copies of an array of the invention, i.e. those that will not serve as templates for further replication, is not affected by the method of crosslinking; however, in situations in which sites of covalent linkage are, preferably, at the termini of molecules of the array, crosslinking methods other than ultraviolet irradiation are employed.

[0114] Step 3. Amplification of the Nucleic Acid Molecules of the Array

[0115] The molecules are amplified in situ (Tsongalis et al., 1994, Clinical Chemistry, 40: 381-384; see also review by Long and Komminoth, 1997, Methods Mol. Biol., 71: 141-161) by standard molecular techniques, such as thermal-cycled PCR (Mullis and Faloona, 1987, Methods Enzymol., 155: 335-350) or isothermal 3SR (Gingeras et al., 1990, Annales de Biologie Clinique, 48(7): 498-501; Guatelli et al., 1990, Proc. Natl. Acad. Sci. U.S.A., 87: 1874). Another method of nucleic acid amplification that is of use according to the invention is the DNA ligase amplification reaction (LAR), which has been described as permitting the exponential increase of specific short sequences through the activities of any one of several bacterial DNA ligases (Wu and Wallace, 1989, Genomics, 4: 560). The contents of this article are herein incorporated by reference.

[0116] The polymerase chain reaction (PCR), which uses multiple cycles of DNA replication catalyzed by a thermostable, DNA-dependent DNA polymerase to amplify the target sequence of interest, is well known in the art, and is presented in detail in the Examples below. The second amplification process, 3SR, is an outgrowth of the transcription-based amplification system (TAS), which capitalizes on the high promoter sequence specificity and reiterative properties of bacteriophage DNA-dependent RNA polymerases to decrease the number of amplification cycles necessary to achieve high amplification levels (Kwoh et al., 1989, Proc. Natl. Acad. Sci. U.S.A., 83: 1173-1177). The 3SR method, which comprises an isothermal, Self-Sustained Sequence Replication amplification reaction, is as follows:

[0117] Each priming oligonucleotide contains the T7 RNA polymerase binding sequence (TAATACGACTCACTATA [SEQ ID NO: 1]) and the preferred transcriptional initiation site. The remaining sequence of each primer is complementary to the target sequence on the molecule to be amplified.

[0118] The 3SR amplification reaction is carried out in 100 μl and contains the target RNA, 40 mM Tris-HCl, pH 8.1, 20 mM MgCl₂, 2 mM spermidine-HCl, 5 mM dithiothreitol, 80 μg/ml BSA, 1 mM dATP, 1 mM dGTP, 1 mM dTTP, 4 mM ATP, 4 mM CTP, 1 mM GTP, 4 mM dTTP, 4 mM ATP, 4 mM CTP, 4 mM GTP, 4 mM UTP, and a suitable amount of oligonucleotide primer (250 ng of a 57-mer; this amount is scaled up or down, proportionally, depending upon the length of the primer sequence). Three to 6 attomoles of the nucleic acid target are used for the 3SR reactions. As a control for background, a 3SR reaction without any target (H₂O) is run. The reaction mixture is heated to 100° C. for 1 minute, and then rapidly chilled to 42° C. After 1 minute, 10 units (usually in a volume of approximately 2 μl) of reverse transcriptase (e.g. avian myeloblastosis virus reverse transcriptase, AMV-RT; Life Technologies/Gibco-BRL) are added. The reaction is incubated for 10 minutes at 42° C. and then heated to 100° C. for 1 minute. (If a 3SR reaction is performed using a single-stranded template, the reaction mixture is heated instead to 65° C. for 1 minute.) Reactions are then cooled to 37° C. for 2 minutes prior to the addition of 4.6 μl of a 3SR enzyme mix, which contains 1.6 μl of AMV-RT at 18.5 units/μl, 1.0 μl T7 RNA polymerase (both e.g. from Stratagene; La Jolla, Calif.) at 100 units/μl and 2.0 μl E. coli RNase H at 4 units/μl (e.g. from Gibco/Life Technologies; Gaithersburg, Md.). It is well within the knowledge of one of skill in the art to adjust enzyme volumes as needed to account for variations in the specific activities of enzymes drawn from different production lots or supplied by different manufacturers. The reaction is incubated at 37° C. for 1 hour and stopped by freezing. While the handling of reagents varies depending on the physical size of the array (which planar surface, if large, requires containment such as a tray or thermal-resistant hybridization bag rather than a tube), this method is of use to amplify the molecules of an array according to the invention.
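The enzyme mix composition stated above, and the lot-to-lot volume adjustment it mentions, reduce to simple arithmetic. The following is a minimal sketch using only the volumes and concentrations given in the text; the function names are hypothetical, and the adjustment assumes that the goal is to hold the number of enzyme units constant when a lot of different concentration is used.

```python
# (volume in microliters, concentration in units per microliter), as stated in the text.
ENZYME_MIX = {
    "AMV-RT": (1.6, 18.5),
    "T7 RNA polymerase": (1.0, 100.0),
    "E. coli RNase H": (2.0, 4.0),
}

def units_per_reaction(mix=ENZYME_MIX):
    """Units of each enzyme delivered by the mix (volume x concentration)."""
    return {name: vol * conc for name, (vol, conc) in mix.items()}

def total_volume_ul(mix=ENZYME_MIX):
    """Total volume of the enzyme mix in microliters."""
    return sum(vol for vol, _ in mix.values())

def volume_for_lot(target_units, lot_units_per_ul):
    """Volume needed from a lot of a different concentration.

    Assumption: the number of enzyme units is held constant between lots.
    """
    if lot_units_per_ul <= 0:
        raise ValueError("lot concentration must be positive")
    return target_units / lot_units_per_ul

print(units_per_reaction())
print(total_volume_ul())
```

With the stated figures, the mix delivers approximately 29.6 units of AMV-RT, 100 units of T7 RNA polymerase and 8 units of RNase H in a total of about 4.6 μl, consistent with the volume given in the text.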

[0119] Other methods which are of use in the amplification of molecules of the array include, but are not limited to, nucleic acid sequence-based amplification (NASBA; Compton, 1991, Nature, 350: 91-92, incorporated herein by reference) and strand-displacement amplification (SDA; Walker et al., 1992, Nucleic Acids Res., 20: 1691-1696, incorporated herein by reference).

[0120] Step 4. Replication of the Array

[0121] a. The master plate generated in steps 1 through 3 is replica-plated by any of a number of methods (reviewed by Lederberg, 1989, Genetics, 121(3): 395-9) onto similar gel-chips. This replication is performed by directly contacting the compressible surfaces of the two gels face to face with sufficient pressure that a few molecules of each clone are transferred from the master to the replica. Such contact is brief, on the order of 1 second to 2 minutes. This is repeated for additional replicas from the same master, limited only by the number of molecules post-amplification available for transfer divided by the minimum number of molecules that must be transferred to achieve an acceptably faithful copy. While it is theoretically possible to transfer as little as a single molecule per feature, a more conservative approach is taken. The number of each species of molecule available for transfer never approaches a value so low as to raise concern about the probability of feature loss or to the point at which a base substitution during replication of one member of a feature could, in subsequent rounds of amplification, create a significant (detectable) population of mutated molecules that might be mistaken for the unaltered sequence, unless errors of those types are within the limits of tolerance for the application for which the array is intended. Note that differential replicative efficiencies of the molecules of the array are not as great a concern as they would be in the case of amplification of a conventional library, such as a phage library, in solution or on a non-covalently-bound array. Because of the physical limitations on diffusion of molecules of any feature, one which is efficiently amplified cannot ‘overgrow’ one which is copied less efficiently, although the density of complete molecules of the latter on the array may be low. It is estimated that 10 to 100 molecules per feature are sufficient to achieve fidelity during the printing process. Typically, at least 100 to 1000 molecules are transferred.
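The limit on the number of replicas obtainable from one master, described above as the number of molecules available for transfer divided by the minimum number that must be transferred, can be written out directly. The following is a minimal sketch; the function name, the optional reserve parameter and the figure of 1,000,000 molecules per feature are hypothetical and are not taken from the specification.

```python
def max_replicas(molecules_available, molecules_per_transfer, reserve=0):
    """Upper bound on replicas printable from one feature of a master.

    molecules_available: molecules in the feature after amplification.
    molecules_per_transfer: molecules moved to each replica.
    reserve: molecules deliberately left on the master (hypothetical safeguard).
    """
    if molecules_per_transfer <= 0:
        raise ValueError("molecules_per_transfer must be positive")
    usable = max(molecules_available - reserve, 0)
    return usable // molecules_per_transfer

# Hypothetical feature holding 1,000,000 molecules, transferring 100 or 1000 per print.
for per_transfer in (100, 1000):
    print(per_transfer, "per transfer ->", max_replicas(1_000_000, per_transfer))
```

The bound applies per feature, so the least abundant feature on the master determines how many faithful replicas can be made.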

[0122] Alternatively, the plated DNA is reproduced inexpensively by microcontact printing, or μCP (Jackman et al., 1995, Science, 269(5224): 664-666), onto a surface with an initially uniform (or patterned) coating of two oligonucleotides (one or both immobilized by their 5′ ends) suitable for in situ amplification. Pattern elements are transferred from an elastomeric support (comparable in its physical properties to support materials that are useful according to the invention) to a rigid, curved object that is rolled over it; if desired, a further, secondary transfer of the pattern elements from the rigid cylinder or other object onto a support is performed. The surface of one or both is compliant to achieve uniform contact. For example, 30 micron thin polyacrylamide films are used for immobilizing oligomers covalently as well as for in situ hybridizations (Khrapko, et al., 1991, DNA Sequence, 1(6):375-88). Effective contact printing is achieved with the transfer of very few molecules of double- or single-stranded DNA from each sub-feature to the corresponding point on the recipient support.

[0123] b. The replicas are then amplified as in step 3.

[0124] c. Alternatively, a replica serves as a master for subsequent steps like step 4, limited by the diffusion of the features and the desired feature resolution.

[0125] Step 5. Identification of Features of the Array

[0126] Ideally, feature identification is performed on the first array of a set produced by the methods described above; however, it is also done using any array of a set, regardless of its position in the line of production. The features are sequenced by hybridization to fluorescently labeled oligomers representing all sequences of a certain length (e.g. all 4096 hexamers) as described for Sequencing-by-Hybridization (SBH, also called Sequencing-by-Hybridization-to-an-Oligonucleotide-Matrix, or SHOM; Drmanac et al., 1993, Science, 260(5114): 1649-52; Khrapko, et al., 1991, supra; Mugasimangalam et al., 1997, Nucleic Acids Res., 25: 800-805). The sequencing in step 5 is considerably easier than conventional SBH if the feature lengths are short (e.g. ss-25-mers rather than the greater than ds-300-mers used in SBH), if the genome sequence is known or if a preselection of features is used.
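The size of the probe set referred to above follows from 4^6 = 4096. The following is a minimal sketch, offered for illustration only, that enumerates all oligomers of a given length and lists those expected to form perfect duplexes with a single-stranded feature; the function names are hypothetical, and real hybridization behavior (mismatch tolerance, secondary structure) is not modeled.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def all_oligomers(k: int) -> list[str]:
    """Every sequence of length k over A, C, G, T (4**k in total)."""
    return ["".join(p) for p in product(BASES, repeat=k)]

def reverse_complement(seq: str) -> str:
    """Reverse complement of a sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def perfect_match_probes(feature: str, k: int) -> set[str]:
    """Probes of length k (written 5'->3') that are exactly complementary
    to some window of a single-stranded feature (written 5'->3')."""
    return {
        reverse_complement(feature[i:i + k])
        for i in range(len(feature) - k + 1)
    }

hexamers = all_oligomers(6)
```

A 25-mer feature has 20 hexamer windows, so at most 20 of the 4096 probes are expected to give a perfect-match signal in this idealized picture, which illustrates why short features are easier to identify.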

[0127] SBH involves a strategy of overlapping block reading. It is based on hybridization of DNA with the complete set of immobilized oligonucleotides of a certain length fixed in specific positions on a support. The efficiency of SBH depends on the ability to sort perfect duplexes effectively from those that are imperfect (i.e. contain base pair mismatches). This is achieved by comparing the temperature-dependent dissociation curves of the duplexes formed by DNA and each of the immobilized oligonucleotides with standard dissociation curves for perfect oligonucleotide duplexes.

[0128] To generate a hybridization and dissociation curve, a ³²P-labeled DNA fragment (30,000 cpm, 30 fmoles) in 1 μl of hybridization buffer (1 M NaCl; 10 mM Na phosphate, pH 7.0; 0.5 mM EDTA) is pipetted onto a dry plate so as to cover a dot of an immobilized oligonucleotide. Hybridization is performed for 30 minutes at 0° C. The support is rinsed with 20 ml of hybridization buffer at 0° C. and then washed 10 times with the same buffer, each wash being performed for 1 minute at a temperature 5° C. higher than the previous one. The remaining radioactivity is measured after each wash with a minimonitor (e.g. a Mini monitor 125; Victoreen) additionally equipped with a count integrator, through a 5 mm aperture in a lead screen. The remaining radioactivity (% of input) is plotted on a logarithmic scale against wash temperature.

[0129] For hybridization with a fluorescently-labeled probe, a volume of hybridization solution sufficient to cover the array is used, containing the probe fragment at a concentration of 2 fmoles/0.01 μl. The hybridization is incubated for 5.0 hours at 17° C., and the array is then washed at 0° C., also in hybridization buffer. Hybridized signal is observed and photographed with a fluorescence microscope (e.g. Leitz “Aristoplan”; input filter 510-560 nm, output filter 580 nm) equipped with a photocamera. Using 250 ASA film, an exposure of approximately 3 minutes is taken.

[0130] For SBH, one suitable immobilization support is a 30 μm-thick polyacrylamide gel covalently attached to glass. Oligonucleotides to be used as probes in this procedure are chemically synthesized (e.g. by the solid-support phosphoramidite method, deprotected in ammonium hydroxide for 12 h at 55° C. and purified by PAGE under denaturing conditions). Prior to use, primers are labeled either at the 5′-end with [γ-³²P]ATP, using T4 polynucleotide kinase, to a specific activity of about 1000 cpm/fmol, or at the 3′-end with a fluorescent label, e.g. tetramethylrhodamine (TMR), coupled to dUTP through the base by terminal transferase (Aleksandrova et al., 1990, Molek. Biologia [Moscow], 24: 1100-1108) and further purified by PAGE.

[0131] An alternative method of sequencing involves successive rounds of stepwise ligation of a labeled probe to, and its cleavage from, a target polynucleotide whose sequence is to be determined (Brenner, U.S. Pat. No. 5,599,675). According to this method, the nucleic acid to be sequenced is prepared as a double-stranded DNA molecule with a “sticky end”, in other words, a single-stranded terminal overhang, which overhang is of a known length that is uniform among the molecules of the preparation, typically 4 to 6 bases. These molecules are then probed in order to determine the identity of a particular base present in the single-stranded region, typically the terminal base. A probe of use in this method is a double-stranded polynucleotide which (i) contains a recognition site for a nuclease, and (ii) typically has a protruding strand capable of forming a duplex with a complementary protruding strand of the target polynucleotide. In each sequencing cycle, only those probes whose protruding strands form perfectly-matched duplexes with the protruding strand of the target polynucleotide hybridize and are then ligated to the end of the target polynucleotide. The probe molecules are divided into four populations, wherein each such population comprises one of the four possible nucleotides at the position to be determined, each labeled with a distinct fluorescent dye. The remaining positions of the duplex-forming region are occupied with randomized, unlabeled bases, so that every possible multimer the length of that region is represented; therefore, a certain percentage of probe molecules in each pool are complementary to the single-stranded region of the target polynucleotide; however, only one pool bears labeled probe molecules that will hybridize.
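The four-pool probe design described above can be sketched as follows. This is an illustrative toy model only: strand orientation is ignored (pairing is evaluated position by position), ligation is treated as all-or-nothing on a perfect match, and the function names are hypothetical rather than part of the cited method.

```python
from itertools import product

BASES = "ACGT"
PAIR = {"A": "T", "C": "G", "G": "C", "T": "A"}

def probe_pools(overhang_len: int, position: int) -> dict[str, list[str]]:
    """Four pools keyed by the defined (labeled) base at `position`;
    every other position is randomized over all four bases."""
    return {
        labeled: [
            "".join(p)
            for p in product(BASES, repeat=overhang_len)
            if p[position] == labeled
        ]
        for labeled in BASES
    }

def call_base(target_overhang: str, position: int) -> str:
    """Infer the target base at `position` from the single pool that
    contains a perfectly matched probe overhang."""
    perfect = "".join(PAIR[b] for b in target_overhang)
    pools = probe_pools(len(target_overhang), position)
    hits = [label for label, pool in pools.items() if perfect in pool]
    if len(hits) != 1:
        raise ValueError("expected exactly one matching pool")
    # The pool's defined base pairs with the target base being read.
    return PAIR[hits[0]]

pools = probe_pools(4, 0)
```

For a four-base overhang each pool in this sketch holds 4^3 = 64 probe sequences, and only the pool whose defined base pairs with the interrogated target base contains the perfectly matched probe.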

[0132] After removal of the unligated probe, a nuclease recognizing the probe cuts the ligated complex at a site one or more nucleotides from the ligation site along the target polynucleotide, leaving an end, usually a protruding strand, capable of participating in the next cycle of ligation and cleavage. An important feature of the nuclease is that its recognition site be separate from its cleavage site. In the course of such cycles of ligation and cleavage, the terminal nucleotides of the target polynucleotide are identified. As stated above, one such category of enzyme is that of type IIs restriction enzymes, which cleave sites up to 20 base pairs remote from their recognition sites; it is contemplated that such enzymes may exist which cleave at distances of up to 30 base pairs from their recognition sites.

[0133] Ideally, it is the terminal base whose identity is being determined (in which case it is the base closest to the double-stranded region of the probe which is labeled), and only this base is cleaved away by the type IIs enzyme. The cleaved probe molecules are recovered (e.g. by hybridization to a complementary sequence immobilized on a bead or other support matrix) and their fluorescent emission spectrum measured using a fluorimeter or other light-gathering device. Note that fluorimetric analysis may be made prior to cleavage of the probe from the test molecule; however, cleavage prior to qualitative analysis of fluorescence allows the next round of sequencing to commence while determination of the identity of the first sequenced base is in progress. Detection prior to cleavage is preferred where sequencing is carried out in parallel on a plurality of sequences (either segments of a single target polynucleotide or a plurality of altogether different target polynucleotides), e.g. attached to separate magnetic beads, or other types of solid phase supports, such as the replicable arrays of the invention. Note that whenever natural protein endonucleases are employed as the nuclease, the method further includes a step of methylating the target polynucleotide at the start of a sequencing operation to prevent spurious cleavages at internal recognition sites fortuitously located in the target polynucleotide.

[0134] By this method, there is no requirement for the electrophoretic separation of closely-sized DNA fragments, for difficult-to-automate gel-based separations, or for the generation of nested deletions of the target polynucleotide. In addition, detection and analysis are greatly simplified because signal-to-noise ratios are much more favorable on a nucleotide-by-nucleotide basis, permitting smaller sample sizes to be employed. For fluorescent-based detection schemes, analysis is further simplified because fluorophores labeling different nucleotides may be separately detected in homogeneous solutions rather than in spatially overlapping bands.

[0135] As alluded to, the target polynucleotide may be anchored to a solid-phase support, such as a magnetic particle, polymeric microsphere, filter material, or the like, which permits the sequential application of reagents without complicated and time-consuming purification steps. The length of the target polynucleotide can vary widely; however, for convenience of preparation, lengths employed in conventional sequencing are preferred. For example, lengths in the range of a few hundred base pairs (200-300) to 1 to 2 kilobase pairs are most often used.

[0136] Probes of use in the procedure may be labeled in a variety of ways, including the direct or indirect attachment of radioactive moieties, fluorescent moieties, colorimetric moieties, and the like. Many comprehensive reviews of methodologies for labeling DNA and constructing DNA probes provide guidance applicable to constructing probes (see Matthews et al., 1988, Anal. Biochem., 169: 1-25; Haugland, 1992, Handbook of Fluorescent Probes and Research Chemicals, Molecular Probes, Inc., Eugene, Oreg.; Keller and Manak, 1993, DNA Probes, 2nd Ed., Stockton Press, New York; Eckstein, ed., 1991, Oligonucleotides and Analogues: A Practical Approach, IRL Press, Oxford; Wetmur, 1991, Critical Reviews in Biochemistry and Molecular Biology, 26: 227-259). Many more particular labelling methodologies are known in the art (see Connolly, 1987, Nucleic Acids Res., 15: 3131-3139; Gibson et al., 1987, Nucleic Acids Res., 15: 6455-6467; Sproat et al., 1987, Nucleic Acids Res., 15: 4837-4848; Fung et al., U.S. Pat. No. 4,757,141; Hobbs, et al., U.S. Pat. No. 5,151,507; Cruickshank, U.S. Pat. No. 5,091,519 [synthesis of functionalized oligonucleotides for attachment of reporter groups]; Jablonski et al., 1986, Nucleic Acids Res., 14: 6115-6128 [enzyme/oligonucleotide conjugates]; and Urdea et al., U.S. Pat. No. 5,124,246 [branched DNA]). The choice of attachment sites of labeling moieties does not significantly affect the ability of a given labeled probe to identify nucleotides in the target polynucleotide, provided that such labels do not interfere with the ligation and cleavage steps. In particular, dyes may be conveniently attached to the end of the probe distal to the target polynucleotide on either the 3′ or 5′ termini of strands making up the probe, e.g. Eckstein (cited above), Fung (cited above), and the like. In some cases, attaching labeling moieties to interior bases or inter-nucleoside linkages may be desirable.

[0137] As stated above, four sets of mixed probes are provided for addition to the target polynucleotide, where each is labeled with a distinguishable label. Typically, the probes are labeled with one or more fluorescent dyes, e.g. as disclosed by Menchen et al., U.S. Pat. No. 5,188,934; Bergot et al., PCT application PCT/US90/05565. Each of four spectrally resolvable fluorescent labels may be attached, for example, by way of Aminolinker II (all available from Applied Biosystems, Inc., Foster City, Calif.); these include TAMRA (tetramethylrhodamine), FAM (fluorescein), ROX (rhodamine X), and JOE (2′,7′-dimethoxy-4′,5′-dichlorofluorescein), and their attachment to oligonucleotides is described in Fung et al., U.S. Pat. No. 4,855,225.

[0138] Typically, nucleases employed in the invention are natural protein endonucleases (i) whose recognition sites are separate from their cleavage sites and (ii) whose cleavage results in a protruding strand on the target polynucleotide. Class IIs restriction endonucleases that may be employed are as previously described (Szybalski et al., 1991, Gene, 100: 13-26; Roberts et al., 1993, Nucleic Acids Res., 21: 3125-3137; Livak and Brenner, U.S. Pat. No. 5,093,245). Exemplary class IIs nucleases include AlwXI, BsmAI, BbvI, BsmFI, SisI, HgaI, BscAI, BbvII, BcefI, Bce85I, BccI, BcgI, BsaI, BsgI, BspMI, Bst71I, EarI, Eco57I, Esp3I, FauI, FokI, GsuI, HphI, MboII, MmeI, RleAI, SapI, SfaNI, TaqII, Tth111, Bco5I, BpuAI, FinI, BsrDI, and isoschizomers thereof. Preferred nucleases include FokI, HgaI, EarI, and SfaNI. Reactions are generally carried out in 50 μL volumes of manufacturer's (New England Biolabs) recommended buffers for the enzymes employed, unless otherwise indicated. Standard buffers are also described in Sambrook et al., 1989, supra.

[0139] When conventional ligases are employed, the 5′ end of the probe may be phosphorylated. A 5′ monophosphate can be attached to a second oligonucleotide either chemically or enzymatically with a kinase (see Sambrook et al., 1989, supra). Chemical phosphorylation is described by Horn and Urdea, 1986, Tetrahedron Lett., 27: 4705, and reagents for carrying out the disclosed protocols are commercially available (e.g. 5′ Phosphate-ON™ from Clontech Laboratories; Palo Alto, Calif.).

[0140] Chemical ligation methods are well known in the art, e.g. Ferris et al., 1989, Nucleosides & Nucleotides, 8: 407-414; Shabarova et al., 1991, Nucleic Acids Res., 19: 4247-4251. Typically, ligation is carried out enzymatically using a ligase in a standard protocol. Many ligases are known and are suitable for use in the invention (Lehman, 1974, Science, 186: 790-797; Engler et al., 1982, “DNA Ligases”, in Boyer, ed., The Enzymes, Vol. 15B pp. 3-30, Academic Press, New York). Preferred ligases include T4 DNA ligase, T7 DNA ligase, E. coli DNA ligase, Taq ligase, Pfu ligase and Tth ligase. Protocols for their use are well known (e.g. Sambrook et al., 1989, supra; Barany, 1991, PCR Methods and Applications, 1: 5-16; Marsh et al., 1992, Strategies, 5: 73-76). Generally, ligases require that a 5′ phosphate group be present for ligation to the 3′ hydroxyl of an abutting strand. This is conveniently provided for at least one strand of the target polynucleotide by selecting a nuclease which leaves a 5′ phosphate, e.g. FokI.

[0141] Prior to nuclease cleavage steps, usually at the start of a sequencing operation, the target polynucleotide is treated to block the recognition sites and/or cleavage sites of the nuclease being employed. This prevents undesired cleavage of the target polynucleotide because of the fortuitous occurrence of nuclease recognition sites at interior locations in the target polynucleotide. Blocking can be achieved in a variety of ways, including methylation and treatment by sequence-specific aptamers, DNA binding proteins, or oligonucleotides that form triplexes. Whenever natural protein endonucleases are employed, recognition sites can be conveniently blocked by methylating the target polynucleotide with the so-called “cognate” methylase of the nuclease being used; for most (if not all) type II bacterial restriction endonucleases, there exist cognate methylases that methylate their corresponding recognition sites. Many such methylases are known in the art (Roberts et al., 1993, supra; Nelson et al., 1993, Nucleic Acids Res., 21: 3139-3154) and are commercially available from a variety of sources, particularly New England Biolabs (Beverly, Mass.).

[0142] The method includes an optional capping step after the unligated probe is washed from the target polynucleotide. In a capping step, by analogy with polynucleotide synthesis (e.g. Andrus et al., U.S. Pat. No. 4,816,571), target polynucleotides that have not undergone ligation to a probe are rendered inert to further ligation steps in subsequent cycles. In this manner spurious signals from “out of phase” cleavages are prevented. When a nuclease leaves a 5′ protruding strand on the target polynucleotides, capping is usually accomplished by exposing the unreacted target polynucleotides to a mixture of the four dideoxynucleoside triphosphates, or other chain-terminating nucleoside triphosphates, and a DNA polymerase. The DNA polymerase extends the 3′ strand of the unreacted target polynucleotide by one chain-terminating nucleotide, e.g. a dideoxynucleotide, thereby rendering it incapable of ligating with probe in subsequent cycles.

[0143] Alternatively, a simple method involving quantitative incremental fluorescent nucleotide addition sequencing (QIFNAS) is employed, in which each end of each clonal oligonucleotide is sequenced by primer extension with a nucleic acid polymerase (e.g. Klenow or Sequenase™; U.S. Biochemicals) and one nucleotide at a time which has a traceable level of the corresponding fluorescent dNTP or rNTP, for example, 100 micromolar dCTP and 1 micromolar fluorescein-dCTP. This is done sequentially, e.g. dATP, dCTP, dGTP, dTTP, dATP and so forth until the incremental change in fluorescence is below a percentage that is adequate for useful discrimination from the cumulative total from previous cycles. The length of the sequence so determined may be extended by any of periodic photobleaching or cleavage of the accumulated fluorescent label from nascent nucleic acid molecules or denaturing the nascent nucleic acid strands from the array and re-priming the synthesis using sequence already obtained.
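The cyclic, one-nucleotide-at-a-time logic of QIFNAS can be sketched as follows. This is an idealized illustration only: it assumes complete extension in every cycle, a fluorescence increment exactly proportional to the number of bases added, and no photobleaching; the function names are hypothetical.

```python
CYCLE_ORDER = "ACGT"  # dATP, dCTP, dGTP, dTTP, then repeat

def simulate_increments(new_strand: str, max_cycles: int = 400) -> list[int]:
    """Number of bases added in each single-nucleotide cycle while the
    strand `new_strand` (5'->3', letters A/C/G/T) is synthesized."""
    increments = []
    pos = 0
    cycle = 0
    while pos < len(new_strand) and cycle < max_cycles:
        base = CYCLE_ORDER[cycle % 4]
        run = 0
        # The polymerase adds this nucleotide for as long as it is the next base.
        while pos < len(new_strand) and new_strand[pos] == base:
            run += 1
            pos += 1
        increments.append(run)
        cycle += 1
    return increments

def decode(increments: list[int]) -> str:
    """Rebuild the synthesized sequence from the per-cycle increments."""
    return "".join(CYCLE_ORDER[i % 4] * n for i, n in enumerate(increments))

signal = simulate_increments("GATTACA")
```

Under the further assumption that every incorporated base contributes equal signal, the increment from one additional base after n bases is roughly 1/n of the cumulative total, which is why the text above contemplates photobleaching, label cleavage or re-priming to extend the read.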

[0144] After features are identified on a first array of the set, it is desirable to provide landmarks by which subsequently-produced arrays of the set are aligned with it, thereby enabling workers to locate on them features of interest. This is important, as the first array of a set produced by the method of the invention is, by nature, random, in that the nucleic acid molecules of the starting pool are not placed down in a specific or pre-ordered pattern based upon knowledge of their sequences.

[0145] Several types of markings are made according to the technology available in the art. For instance, selected features are removed by laser ablation (Matsuda and Chung, 1994, ASAIO Journal, 40(3): M594-7; Jay, 1988, Proc. Natl. Acad. Sci. U.S.A., 85: 5454-5458; Kimble, 1981, Dev. Biol., 87(2): 286-300) or selectively replicated on copies of an array by laser-enhanced adhesion (Emmert-Buck et al., 1996, Science, 274(5289): 998-1001). These methods are used to eliminate nucleic acid features that interfere with adjacent features or to create a pattern that is easier for software to align.

[0146] Laser ablation is carried out as follows: A KrF excimer laser, e.g. a Hamamatsu L4500 (Hamamatsu, Japan) (pulse wavelength, 248 nm; pulse width, 20 ns) is used as the light source. The laser beam is converged through a laser-grade UV quartz condenser lens to yield maximum fluences of 3.08 J/cm² per pulse. Ablation of the matrix and underlying glass surface is achieved by this method. The depth of etching into the glass surfaces is determined using real-time scanning laser microscopy (Lasertec 1 LM21W, Yokohama, Japan), and a depth profile is determined.

[0147] Selective transfer of features via laser-capture microdissection proceeds as follows: A flat film (100 μm thick) is made by spreading a molten thermoplastic material, e.g. ethylene vinyl acetate polymer (EVA; Adhesive Technologies; Hampton, N.H.), on a smooth silicone or polytetrafluoroethylene surface. The optically-transparent thin film is placed on top of an array of the invention, and the array/film sandwich is viewed in an inverted microscope (e.g. an Olympus Model CK2; Tokyo) at 100× magnification (10× objective). A pulsed carbon dioxide laser beam is introduced by way of a small front-surface mirror coaxial with the condenser optical path, so as to irradiate the upper surface of the EVA film. The carbon dioxide laser (either Apollo Company model 580, Los Angeles, or California Laser Company model LS150, San Marcos, Calif.) provides individual energy pulses of adjustable length and power. A ZnSe lens focuses the laser beam to a target of adjustable spot size on the array. For transfer spots of 150 μm diameter, a 600-microsecond pulse delivers 25-30 mW to the film. The power is decreased or increased approximately in proportion to the diameter of the laser spot focused on the array. The absorption coefficient of the EVA film, measured by Fourier transmission, is 200 cm⁻¹ at a laser wavelength of 10.6 μm. Because >90% of the laser radiation is absorbed within the thermoplastic film, little direct heating occurs. The glass plate or chip upon which the semi-solid support has been deposited provides a heat sink that confines the full-thickness transient focal melting of the thermoplastic material to the targeted region of the array. The focally-molten plastic moistens the targeted features. After cooling and recrystallization, the film forms a local surface bond to the targeted nucleic acid molecules that is stronger than the adhesion forces that mediate their affinity for the semi-solid support medium. The film and targeted nucleic acids are removed from the array, resulting in focal microtransfer of the targeted nucleic acids to the film surface.

[0148] If removal of molecules from the array by this method is performed for the purpose of ablation, the procedure is complete. If desired, these molecules instead are amplified and cloned out, as described in Example 7.

[0149] A method provided by the invention for the easy orientation of the nucleic acid molecules of a set of arrays relative to one another is “array templating”. A homogeneous solution of an initial library of single-stranded DNA molecules is spread over a photolithographic all-10-mer ss-DNA oligomer array under conditions which allow sequences comprised by library members to become hybridized to member molecules of the array, forming an arrayed library where the coordinates are in order of sequence as defined by the array. For example, a 3′-immobilized 10-mer (upper strand) binds a 25-mer library member (lower strand) as shown below:

               5′-TGCATGCTAT-3′ [SEQ ID NO: 2]
3′-CGATGCATTTACGTAACGTACGATA-5′ [SEQ ID NO: 3]
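The pairing of the two sequences given in this paragraph can be checked mechanically. The following is a minimal sketch, with a hypothetical function name, that looks for the position at which the immobilized 10-mer pairs base-for-base with the 25-mer library member; the two strands are entered in the orientations in which they are written above.

```python
PAIR = {"A": "T", "C": "G", "G": "C", "T": "A"}

def binding_offset(probe_5to3: str, member_3to5: str) -> int:
    """Leftmost offset at which the probe (written 5'->3') pairs
    base-for-base with the library member (written 3'->5'); -1 if none."""
    footprint = "".join(PAIR[b] for b in probe_5to3)
    return member_3to5.find(footprint)

probe = "TGCATGCTAT"                  # SEQ ID NO: 2, written 5'->3'
member = "CGATGCATTTACGTAACGTACGATA"  # SEQ ID NO: 3, written 3'->5'
offset = binding_offset(probe, member)
```

In this example the 10-mer pairs with the ten bases at the 5′ end of the 25-mer (offset 15 in the 3′-to-5′ representation), leaving a 15-base single-stranded tail.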

[0150] Covalent linkage of the 25-mer sequence to the support, amplification and replica printing are performed by any of the methods described above. Further characterization, if required, is carried out by SBH, fluorescent dNTP extension or any other sequencing method applicable to nucleic acid arrays, such as are known in the art. This greatly enhances the ability to identify the sequence of a sufficient number of oligomer features in the replicated array to make the array useful in subsequent applications.

EXAMPLE 2

[0151] Ordered Chromosomal Arrays According to the Invention

[0152] Direct in situ single-copy (DISC)-PCR is a method that uses two primers that define unique sequences for on-slide PCR directly on metaphase chromosomes (Troyer et al., 1994a, Mammalian Genome, 5: 112-114; summarized by Troyer et al., 1997, Methods Mol. 71: PRINS and In Situ PCR Protocols, J. R. Godsen, ed., Humana Press, Inc., Totowa, N.J., pp. 71-76). It thus allows exponential accumulation of PCR product at specific sites, and so may be adapted for use according to the invention.

[0153] The DISC-PCR procedure has been used to localize sequences as short as 100-300 bp to mammalian chromosomes (Troyer et al., 1994a, supra; Troyer et al., 1994b, Cytogenet. Cell Genetics, 67(3): 199-204; Troyer et al., 1995, Anim. Biotechnology, 6(1): 51-58; and Xie et al., 1995, Mammalian Genome, 6: 139-141). It is particularly suited for physically assigning sequence tagged sites (STSs), such as microsatellites (Litt and Luty, 1989, Am. J. Hum. Genet., 44: 397-401; Weber and May, 1989, Am. J. Hum. Genet., 44: 338-396), many of which cannot be assigned by in situ hybridization because they have been isolated from small-insert libraries for rapid sequencing. It can also be utilized to map expressed sequence tags (ESTs) physically (Troyer, 1994a, supra; Schmutz et al., 1996, Cytogenet. Cell Genetics, 72: 37-39). DISC-PCR obviates the necessity for an investigator to have a cloned gene in hand, since all that is necessary is to have enough sequence information to synthesize PCR primers. By the methods of the invention, target-specific primers need not even be utilized; all that is required is a mixed pool of primers whose members have at one end a ‘universal’ sequence, suitable for manipulations such as restriction endonuclease cleavage or hybridization to oligonucleotide molecules immobilized on, or added to, a semi-solid support and, at the other end, an assortment of random sequences (for example, every possible hexamer) which will prime in situ amplification of the chromosome. As described above, the primers may include terminal crosslinking groups with which they may be attached to the semi-solid support of the array following transfer; alternatively, they may lack such an element, and be immobilized to the support either through ultraviolet crosslinking or through hybridization to complementary, immobilized primers and subsequent primer extension, such that the newly-synthesized strand becomes permanently bound to the array. The DISC-PCR procedure is summarized briefly as follows:

[0154] Metaphase chromosomes anchored to glass slides are prepared by standard techniques (Halnan, 1989, in Cytogenetics of Animals, C. R. E. Halnan, ed., CAB International, Wallingford, U.K., pp. 451-456), using slides that have been pre-rinsed in ethanol and dried using lint-free gauze. Slides bearing chromosome spreads are washed in phosphate-buffered saline (PBS; 8.0 g NaCl, 1.3 g Na₂HPO₄ and 4 g NaH₂PO₄ dissolved in deionized water, adjusted to a volume of 1 liter and pH of 7.4) for 10 min and dehydrated through an ethanol series (70-, 80-, 95-, and 100%). Note that in some cases, overnight fixation of chromosomes in neutral-buffered formalin followed by digestion for 15 minutes with pepsinogen (2 mg/ml; Sigma) improves amplification efficiency.

[0155] For each slide, the following solution is prepared in a microfuge tube: 200 μM each dATP, dCTP, dGTP and dTTP; all deoxynucleotides are maintained as frozen, buffered 10 mM stock solutions or in dry form, and may be obtained either dry or in solution from numerous suppliers (e.g. Perkin Elmer, Norwalk, Conn.; Sigma, St. Louis, Mo.; Pharmacia, Uppsala, Sweden). The reaction mixture for each slide includes 1.5 μM each primer (from 20 μM stocks), 2.0 μL 10× Taq polymerase buffer (100 mM Tris-HCl, pH 8.3, 500 mM KCl, 15 mM MgCl₂, 0.1% BSA; Perkin Elmer), 2.5 units AmpliTaq polymerase (Perkin Elmer) and deionized H₂O to a final volume of 20 μl. Note that the commercially supplied Taq polymerase buffer is normally adequate; however, adjustments may be made as needed in [MgCl₂] or pH, in which case an optimization kit, such as the Opti-Primer PCR Kit (Stratagene; La Jolla, Calif.) may be used. The above reaction mixture is pipetted onto the metaphase chromosomes and covered with a 22×50 mm coverslip, the perimeter of which is then sealed with clear nail polish. All air bubbles, even the smallest, are removed prior to sealing, as they expand when heated, and will inhibit the reaction. A particularly preferred polish is Hard As Nails (Sally Hansen); this nail enamel has been found to be resistant to leakage, which, if it occurred, would also compromise the integrity of the reaction conditions and inhibit amplification of the chromosomal DNA sequences. One heavy coat is sufficient. After the polish has been allowed to dry at room temperature, the edges of the slide are covered with silicone grease (Dow Corning Corporation, Midland, Mich.). Slides are processed in a suitable thermal cycler (i.e. one designed for on-slide PCR, such as the BioOven III; Biotherm Corp., Fairfax, Va.) using the following profile:
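The stock volumes implied by the concentrations given in this paragraph follow from C1·V1 = C2·V2. The following is a minimal sketch for illustration; the use of exactly two primers follows the description of DISC-PCR above, while the enzyme volume of 0.5 μl is an assumption (the text specifies 2.5 units but not the concentration of the enzyme stock), and the function and variable names are hypothetical.

```python
def stock_volume_ul(final_conc_um: float, stock_conc_um: float,
                    final_volume_ul: float) -> float:
    """Volume of stock needed so that the final mix reaches final_conc_um
    (C1 * V1 = C2 * V2, concentrations in micromolar)."""
    return final_conc_um * final_volume_ul / stock_conc_um

FINAL_VOLUME_UL = 20.0

dntp_each_ul = stock_volume_ul(200.0, 10_000.0, FINAL_VOLUME_UL)  # 10 mM stocks
primer_each_ul = stock_volume_ul(1.5, 20.0, FINAL_VOLUME_UL)      # 20 uM stocks
buffer_10x_ul = FINAL_VOLUME_UL / 10.0                            # 10x buffer
enzyme_ul = 0.5  # ASSUMPTION: volume holding 2.5 units; not stated in the text

# Water makes up the remainder (four dNTPs, two primers, buffer, enzyme).
water_ul = FINAL_VOLUME_UL - (
    4 * dntp_each_ul + 2 * primer_each_ul + buffer_10x_ul + enzyme_ul
)
```

With these inputs each dNTP stock contributes 0.4 μl, each primer stock 1.5 μl and the 10× buffer 2.0 μl, the last of which agrees with the 2.0 μL stated in the text.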

[0156] a. 94° C. for 3 min.

[0157] b. Annealing temperature of primers for 1 min.

[0158] c. 72° C. for 1 min.

[0159] d. 92° C. for 1 min.

[0160] e. Cycle to step b 24 more times (25 cycles total).

[0161] f. Final extension step of 3-5 min.
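The hold times in steps a through f can be totalled as follows. This is a rough sketch only: it assumes that steps b, c and d are each performed in every one of the 25 cycles and it ignores ramp times between temperatures, which depend on the instrument; the function name is hypothetical.

```python
def profile_minutes(final_extension_min: float, cycles: int = 25) -> float:
    """Sum of programmed hold times for the profile, excluding ramp times."""
    initial_denaturation = 3.0   # step a: 94 C for 3 min
    per_cycle = 1.0 + 1.0 + 1.0  # steps b, c and d: 1 min each
    return initial_denaturation + cycles * per_cycle + final_extension_min

shortest = profile_minutes(3.0)  # final extension of 3 min
longest = profile_minutes(5.0)   # final extension of 5 min
```

On these assumptions the programmed holds total 81 to 83 minutes per run.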

[0162] After thermal cycling is complete, silicone grease is removed with a tissue, and the slide is immersed in 100% ethanol. Using a sharp razor blade, the nail polish is cut through and the edge of the coverslip is lifted gently and removed. It is critical that the slide never be allowed to dry from this point on, although excess buffer is blotted gently off of the slide edge. The slide is immersed quickly in 4× SSC and excess nail polish is scraped from the edges of the slide prior to subsequent use.

[0163] The slide is contacted immediately with a semi-solid support in order to transfer to it the amplified nucleic acid molecules; alternatively, the slide is first equilibrated in a liquid medium that is isotonic with, or ideally identical to, that which permeates (i.e. is present in the pores of) the semi-solid support matrix. From that point on, the array is handled comparably with those prepared according to the methods presented in Example 1. Feature identification, also as described above, permits determination of the approximate positions of genetic elements along the length of the template chromosome. In preparations in which chromosomes are linearly extended (stretched), the accuracy of gene ordering is enhanced. This is particularly useful in instances in which such information is not known, either through classical or molecular genetic studies, even in the extreme case of a chromosome that is entirely uncharacterized. By this method, comparative studies of homologous chromosomes between species of interest are performed, even if no previous genetic mapping has been performed on either. The information so gained is valuable in terms of gauging the evolutionary relationships between species, in that both large and small chromosomal rearrangements are revealed. The genetic basis of phenotypic differences between different individuals of a single species, e.g. human subjects, is also investigated by this method. When template chromosomes are condensed (coiled), more information is gained regarding the in vivo spatial relationships among genetic elements. This may have implications in terms of cell-type specific gene transcriptional activity, upon which comparison of arrays generated from samples comprising condensed chromosomes drawn from cells of different tissues of the same organism may shed light.

[0164] While the methods by which histological samples are prepared, PCR is performed and the first copy of the chromosomal array is generated are time-consuming, multiple copies of the array are produced easily according to the invention, as described above in Example 1 and elsewhere. The ability of the invention to reproduce what would, otherwise, be a unique array provides a valuable tool by which scientists have the power to work in parallel or to perform analyses of different types upon comparable samples. In addition, it allows for the generation of still more copies of the array for distribution to any number of other workers who may desire to confirm or extend any data set derived from such an array at any time.

[0165] A variation on this use of the present invention is chromosome templating. DNA (e.g. that of a whole chromosome) is stretched out and fixed on a surface (Zimmermann and Cox, 1994, Nucleic Acids Res., 22(3): 492-497). Segments of such immobilized DNA are made single-stranded by exonucleases, chemical denaturants (e.g. formamide) and/or heat. The single-stranded regions are hybridized to the variable portions of an array of single-stranded DNA molecules each bearing regions of randomized sequence, thereby forming an array where the coordinates of features correspond to their order on a linear extended chromosome. Alternatively, a less extended structure, which replicates the folded or partially-unfolded state of various nucleic acid compartments in a cell, is made by using a condensed (coiled), rather than stretched, chromosome.

[0166] EXAMPLE 3

[0167] RNA Localization Arrays

[0168] The methods described in Example 2, above, are applied with equal success to the generation of an array that provides a two-dimensional representation of the spatial distribution of the RNA molecules of a cell. This method is applied to ‘squashed’ cellular material, prepared as per the chromosomal spreads described above in Example 2; alternatively, sectioned tissue samples affixed to glass surfaces are used. Either paraffin-, plastic- or frozen (Serrano et al., 1989, Dev. Biol., 132: 410-418) sections are used in the latter case.

[0169] Tissue samples are fixed using conventional reagents; formalin, 4% paraformaldehyde in an isotonic buffer, formaldehyde (each of which confers a measure of RNAase resistance to the nucleic acid molecules of the sample) or a multi-component fixative, such as FAAG (85% ethanol, 4% formaldehyde, 5% acetic acid, 1% EM grade glutaraldehyde) is adequate for this procedure. Note that water used in the preparation of any aqueous components of solutions to which the tissue is exposed until it is embedded is RNAase-free, i.e. treated with 0.1% diethylpyrocarbonate (DEPC) at room temperature overnight and subsequently autoclaved for 1.5 to 2 hours. Tissue is fixed at 4° C., either on a sample roller or a rocking platform, for 12 to 48 hours in order to allow fixative to reach the center of the sample. Prior to embedding, samples are purged of fixative and dehydrated; this is accomplished through a series of two- to ten-minute washes in increasingly high concentrations of ethanol, beginning at 60% and ending with two washes in 95% and another two in 100% ethanol, followed by two ten-minute washes in xylene. Samples are embedded in any of a variety of sectioning supports, e.g. paraffin, plastic polymers or a mixed paraffin/polymer medium (e.g. Paraplast® Plus Tissue Embedding Medium, supplied by Oxford Labware). For example, fixed, dehydrated tissue is transferred from the second xylene wash to paraffin or a paraffin/polymer resin in the liquid phase at about 58° C.; the medium is then replaced three to six times over a period of approximately three hours to dilute out residual xylene, followed by overnight incubation at 58° C. under a vacuum, in order to optimize infiltration of the embedding medium into the tissue. The next day, following several more changes of medium at 20-minute to one-hour intervals, also at 58° C., the tissue sample is positioned in a sectioning mold, the mold is surrounded by ice water and the medium is allowed to harden. Sections of 6 μm thickness are taken and affixed to ‘subbed’ slides, which are those coated with a proteinaceous substrate material, usually bovine serum albumin (BSA), to promote adhesion. Other methods of fixation and embedding are also applicable for use according to the methods of the invention; examples of these are found in Humason, G. L., 1979, Animal Tissue Techniques, 4th ed. (W.H. Freeman & Co., San Francisco), as is frozen sectioning.

[0170] Following preparation of either squashed or sectioned tissue, the RNA molecules of the sample are reverse-transcribed in situ. In order to contain the reaction on the slide, tissue sections are placed on a slide thermal cycler (e.g. Tempcycler II; COY Corp., Grass Lake, Mich.) with heating blocks designed to accommodate glass microscope slides. Stainless steel or glass (Bellco Glass Inc.; Vineland, N.J.) tissue culture cloning rings approximately 0.8 cm (inner diameter) × 1.0 cm in height are placed on top of the tissue section. Clear nail polish is used to seal the bottom of the ring to the tissue section, forming a vessel for the reverse transcription and subsequent localized in situ amplification (LISA) reaction (Tsongalis et al., 1994, supra).

[0171] Reverse transcription is carried out using reverse transcriptase (e.g. avian myeloblastosis virus reverse transcriptase, AMV-RT; Life Technologies/Gibco-BRL or Moloney Murine Leukemia Virus reverse transcriptase, M-MLV-RT, New England Biolabs, Beverly, Mass.) under the manufacturer's recommended reaction conditions. For example, the tissue sample is rehydrated in the reverse transcription reaction mix, minus enzyme, which contains 50 mM Tris-HCl (pH 8.3), 8 mM MgCl₂, 10 mM dithiothreitol, 1.0 mM each dATP, dTTP, dCTP and dGTP and 0.4 mM oligo-dT (12- to 18-mers). The tissue sample is, optionally, rehydrated in RNAase-free TE (10 mM Tris-HCl, pH 8.3 and 1 mM EDTA), then drained thoroughly prior to addition of the reaction buffer. To denature the RNA molecules, which may have formed some double-stranded secondary structures, and to facilitate primer annealing, the slide is heated to 65° C. for 1 minute, after which it is cooled rapidly to 37° C. After 2 minutes, 500 units of M-MLV-RT are added to the mixture, bringing the total reaction volume to 100 μl. The reaction is incubated at 37° C. for one hour, with the reaction vessel covered by a microscope cover slip to prevent evaporation.

[0172] Following reverse transcription, reagents are pipetted out of the containment ring structure, which is rinsed thoroughly with TE buffer in preparation for amplification of the resulting cDNA molecules.

[0173] The amplification reaction is performed in a total volume of 25 μl, which consists of 75 ng of both the forward and reverse primers (for example the mixed primer pools 1 and 2 of Example 6) and 0.6 U of Taq polymerase in a reaction solution containing, per liter: 200 nmol of each deoxynucleotide triphosphate, 1.5 mmol of MgCl₂, 67 mmol of Tris-HCl (pH 8.8), 10 mmol of 2-mercaptoethanol, 16.6 mmol of ammonium sulfate, 6.7 μmol of EDTA, and 10 μmol of digoxigenin-11-dUTP. The reaction mixture is added to the center of the cloning ring, and layered over with mineral oil to prevent evaporation before slides are placed back onto the slide thermal cycler. DNA is denatured in situ at 94° C. for 2 min prior to amplification. LISA is accomplished by using 20 cycles, each consisting of a 1-minute primer annealing step (55° C.), a 1.5-min extension step (72° C.), and a 1-min denaturation step (94° C.). These amplification cycle profiles differ from those used in tube amplification to preserve optimal tissue morphology, hence the distribution of reverse transcripts and the products of their amplification on the slide.
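The cycling program stated in this paragraph can be tallied with a short sketch. The variable and function names are illustrative assumptions, and instrument ramp times between temperatures are ignored, so the result is only a lower bound on the actual run time.

```python
# Hold times (minutes) taken from the LISA protocol in the text.
# Ramp times between temperatures are NOT modeled (assumption).
INITIAL_DENATURATION_MIN = 2.0   # 94 C, once, before cycling
CYCLE_STEPS_MIN = {
    "anneal_55C": 1.0,
    "extend_72C": 1.5,
    "denature_94C": 1.0,
}
N_CYCLES = 20


def total_hold_time_min(initial, steps, n_cycles):
    """Sum of programmed hold times, in minutes."""
    return initial + n_cycles * sum(steps.values())


lisa_minutes = total_hold_time_min(
    INITIAL_DENATURATION_MIN, CYCLE_STEPS_MIN, N_CYCLES
)
print(f"Programmed hold time: {lisa_minutes:.0f} min")
```

Under these assumptions the programmed holds sum to 2 + 20 × 3.5 minutes.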

[0174] Following amplification, the oil layer and reaction mix are removed from the tissue sample, which is then rinsed with xylene. The containment ring is removed with acetone, and the tissue containing the amplified cDNA is rehydrated by washing three times in approximately 0.5 ml of a buffer containing 100 mM Tris-Cl (pH 7.5) and 150 mM NaCl. The immobilized nucleic acid array of the invention is then formed by contacting the amplified nucleic acid molecules with a semi-solid support and covalently crosslinking them to it, by any of the methods described above.

[0175] Features are identified using SBH, also as described above, and correlated with the positions of mRNA molecules in the cell.

EXAMPLE 4

[0176] Size-sorted Genomic Arrays

[0177] As mentioned above, it is possible to prepare a support matrix in which are embedded whole, even living, cells. Such protocols have been developed for various purposes, such as encapsulated, implantable cell-based drug-delivery vehicles, and the delivery to an electrophoretic matrix of very large, unsheared DNA molecules, as required for pulsed-field gel electrophoresis (Schwartz and Cantor, 1984, Cell, 37: 67-75). The arrays of the invention are constructed using as the starting material genomic DNA from a cell of an organism that has been embedded in an electrophoretic matrix and lysed in situ, such that intact nucleic acid molecules are released into the support matrix environment. If an array based upon copies of large molecules is made, such as is of use in a fashion similar to the chromosomal element ordering arrays described above in Example 2, then a low-percentage agarose gel is used as a support. Following lysis (Schwartz and Cantor, 1984, supra), the resulting large molecules may be size-sorted electrophoretically prior to in situ PCR amplification and linkage to the support, both as described above. If it is desired to preserve the array on a support other than agarose, which may be difficult to handle if the gel is large, the array is transferred via electroblotting onto a second support, such as a nylon or nitrocellulose membrane, prior to linkage.

[0178] If it is not considered essential to preserve the associations between members of genetic linkage groups (at the coarsest level of resolution, chromosomes), nucleic acid molecules are cleaved, mechanically, chemically or enzymatically, prior to electrophoresis. A more even distribution of nucleic acid over the support results, and physical separation of individual elements from one another is improved. In such a case, a polyacrylamide, rather than agarose, gel matrix is used as a support. The arrays produced by this method do, to a certain extent, resemble sequencing gels; cleavage of an electrophoresed array, e.g. with a second restriction enzyme or combination thereof, followed by electrophoresis in a second dimension improves resolution of individual nucleic acid sequences from one another.

[0179] Such an array is constructed to any desired size. It is now feasible to scan large gels (for example, 40 cm in length) at high resolution. In addition, advances in gel technology now permit sequencing to be performed on gels a mere 4 cm long, one tenth the usual length, which demonstrates that a small gel is also useful according to the invention.

EXAMPLE 5

[0180] Spray-painted Arrays (Inkjet)

[0181] Immobilized nucleic acid molecules may, if desired, be produced using a device (e.g., any commercially-available inkjet printer, which may be used in substantially unmodified form) which sprays a focused burst of nucleic acid synthesis compounds onto a support (see Castellino, 1997, Genome Res., 7: 943-976). Such a method is currently in practice at Incyte Pharmaceuticals and Rosetta Biosystems, Inc., the latter of which employs what are said to be minimally-modified Epson inkjet cartridges (Epson America, Inc.; Torrance, Calif.). The method of inkjet deposition depends upon the piezoelectric effect, whereby a narrow tube containing a liquid of interest (in this case, oligonucleotide synthesis reagents) is encircled by an adapter. An electric charge sent across the adapter causes the adapter to expand at a different rate than the tube, and forces a small drop of liquid containing phosphoramidite chemistry reagents from the tube onto a coated slide or other support.

[0182] Reagents are deposited onto a discrete region of the support, such that each region forms a feature of the array; the desired nucleic acid sequence is synthesized drop-by-drop at each position, as is true in other methods known in the art. If the angle of dispersion of reagents is narrow, it is possible to create an array comprising many features. Alternatively, if the spraying device is more broadly focused, such that it disperses nucleic acid synthesis reagents in a wider angle, as much as an entire support is covered each time, and an array is produced in which each member has the same sequence (i.e. the array has only a single feature).

[0183] Arrays of both types are of use in the invention. A multi-feature array produced by the inkjet method is used in array templating, as described above; a random library of nucleic acid molecules is spread upon such an array as a homogeneous solution comprising a mixed pool of nucleic acid molecules, by contacting the array with a tissue sample comprising nucleic acid molecules, or by contacting the array with another array, such as a chromosomal array (Example 2) or an RNA localization array (Example 3).

[0184] Alternatively, a single-feature array produced by the inkjet method is used by the same methods to immobilize nucleic acid molecules of a library which comprise a common sequence, whether a naturally-occurring sequence of interest (e.g. a regulatory motif) or an oligonucleotide primer sequence comprised by all or a subset of library members, as described herein above and in Example 6, below.

[0185] Nucleic acid molecules which thereby are immobilized upon an ordered inkjet array (whether such an array comprises one or a plurality of oligonucleotide features) are amplified in situ, transferred to a semi-solid support and immobilized thereon to form a first randomly-patterned, immobilized nucleic acid array, which is subsequently used as a template with which to produce a set of such arrays according to the invention, all as described above.

EXAMPLE 6

[0186] Isolation of a Feature from an Array of the Invention (Method 1)/Heterologous Arrays

[0187] As described above in Example 1, sets of arrays are, if desired, produced according to the invention such that they incorporate oligonucleotide sequences bearing restriction sites linked to the ends of each feature. This provides a method for creating spatially-unique arrays of primer pairs for in situ amplification, in which each feature has a distinct set of primer pairs. One or both of the universal primers comprises a restriction endonuclease recognition site, such as a type IIS sequence (e.g. Eco57I or MmeI, which will cut up to 20 bp away). Treatment of the whole double-stranded array with the corresponding enzyme(s) followed by melting and washing away the non-immobilized strand creates the desired primer pairs with well-defined 3′ ends. Alternatively, a double-strand-specific 3′ exonuclease treatment of the double-stranded array is employed, but the resulting single-stranded 3′ ends may vary in exact endpoint. The 3′ ends of the primers are used for in situ amplification, for example of variant sequences in diagnostics. This method, by which arrays of unique primer pairs are produced efficiently, provides an advance over the method of Adams and Kron (1997, supra), in which each single pair of primers is manually constructed and placed. Cloning of a given feature from an array of such a set is performed as follows:

[0188] MmeI is a restriction endonuclease having the property of cleaving at a site remote from its recognition site, TCCGAC. Heterogeneous pools of primers are constructed that comprise (from 5′ to 3′) a sequence shared by all members of the pool, the MmeI recognition site, and a variable region. The variable region may comprise either a fully-randomized sequence (e.g. all possible hexamers) or a selected pool of sequences (e.g. variations on a particular protein-binding, or other, functional sequence motif). If the variable sequence is random, the length of the randomized sequence determines the sequence complexity of the pool. For example, randomization of a hexameric sequence at the 3′ ends of the primers results in a pool comprising 4,096 distinct sequence combinations. Examples of two such mixed populations of oligonucleotides (in this case, 32-mers) are primer pools 1 and 2, below:
primer 1 (a pool of 4096 32-mers): 5′-gcagcagtacgactagcataTCCGACnnnnnn-3′ [SEQ ID NO: 4]
primer 2 (a pool of 4096 32-mers): 5′-cgatagcagtagcatgcaggTCCGACnnnnnn-3′ [SEQ ID NO: 5]
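The relationship between the length of the randomized region and the complexity of the pool (4 raised to the number of randomized positions) can be checked with a minimal sketch. The function names are illustrative assumptions; the constant region is that of primer pool 1 as given in the text.

```python
from itertools import product

# Constant region of primer pool 1 from the text:
# shared sequence followed by the MmeI recognition site.
CONSTANT_1 = "gcagcagtacgactagcata" + "TCCGAC"


def pool_complexity(n_random):
    """Number of distinct sequences for n fully randomized positions."""
    return 4 ** n_random


def enumerate_pool(constant, n_random):
    """Yield every primer: the constant region plus each possible random tail."""
    for tail in product("acgt", repeat=n_random):
        yield constant + "".join(tail)


pool_1 = list(enumerate_pool(CONSTANT_1, 6))
print(pool_complexity(6), len(pool_1), len(pool_1[0]))
```

Enumerating the pool explicitly is practical only for short randomized regions, since the pool grows fourfold with each added position.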

[0189] A nucleic acid preparation is amplified, using primer 1 to randomly prime synthesis of sequences present therein. The starting nucleic acid molecules are cDNA or genomic DNA, either of which may comprise molecules that are substantially whole or that are cleaved into smaller pieces. Many DNA cleavage methods are well known in the art. Mechanical cleavage is achieved by several methods, including sonication, repeated passage through a hypodermic needle, boiling or repeated rounds of rapid freezing and thawing. Chemical cleavage is achieved by means which include, but are not limited to, acid or base hydrolysis, or cleavage by base-specific cleaving substances, such as are used in DNA sequencing (Maxam and Gilbert, 1977, Proc. Natl. Acad. Sci. U.S.A., 74: 560-564). Alternatively, enzymatic cleavage that is site-specific, such as is mediated by restriction endonucleases, or more general, such as is mediated by exo- and endonucleases, e.g. ExoIII, mung bean nuclease, DNAase I or, under specific buffer conditions, DNA polymerases (such as T4), which chew back or internally cleave DNA in a proofreading capacity, is performed. If the starting nucleic acid molecules (which may, additionally, comprise RNA) are fragmented rather than whole (whether closed circular or chromosomal), so as to have free ends to which a second sequence may be attached by means other than primed synthesis, the MmeI recognition sites may be linked to the starting molecules using DNA ligase, RNA ligase or terminal deoxynucleotide transferase. Reaction conditions for these enzymes are as recommended by the manufacturer (e.g. New England Biolabs; Beverly, Mass. or Boehringer Mannheim Biochemicals, Indianapolis, Ind.). If employed, PCR is performed using template DNA (at least 1 fg; more usefully, 1-1,000 ng) and at least 25 pmol of oligonucleotide primers; an upper limit on primer concentration is set by aggregation at about 10 μg/ml. A typical reaction mixture includes: 21 μl of DNA, 25 pmol of oligonucleotide primer, 2.5 μl of 10× PCR buffer 1 (Perkin-Elmer, Foster City, Calif.), 0.4 μl of 1.25 μM dNTP, 0.15 μl (or 2.5 units) of Taq DNA polymerase (Perkin Elmer, Foster City, Calif.) and deionized water to a total volume of 25 μl. Mineral oil is overlaid and the PCR is performed using a programmable thermal cycler. The length and temperature of each step of a PCR cycle, as well as the number of cycles, are adjusted in accordance with the stringency requirements in effect. Initial denaturation of the template molecules normally occurs at between 92° C. and 99° C. for 4 minutes, followed by 20-40 cycles consisting of denaturation (94-99° C. for 15 seconds to 1 minute), annealing (temperature determined as discussed below, 1-2 minutes), and extension (72° C. for 1 minute). Final extension is generally for 4 minutes at 72° C., and may be followed by an indefinite (0-24 hour) step at 4° C.

[0190] Annealing temperature and timing are determined both by the efficiency with which a primer is expected to anneal to a template and the degree of mismatch that is to be tolerated. In attempting to amplify a mixed population of molecules, the potential loss of molecules having target sequences with low melting temperatures under stringent (high-temperature) annealing conditions is weighed against the promiscuous annealing of primers to sequences other than their target sequence. The ability to judge the limits of tolerance for feature loss vs. the inclusion of artifactual amplification products is within the knowledge of one of skill in the art. An annealing temperature of between 30° C. and 65° C. is used. One primer out of the pool of 4096 primer 1 sequences (primer 1ex) is shown below, as is a DNA sequence from the preparation with which primer 1ex has high 3′ end complementarity at a random position. The priming site is underlined on either nucleic acid molecule.
primer 1ex [SEQ ID NO: 7; bases 1-32]: 5′-gcagcagtacgactagcataTCCGACctgcgt-3′
genomic DNA [SEQ ID NO: 6]: 3′-tttcgacgcacatcgcgtgcatggccccatgcatcaggctgacgaccgtcgtacgtctactcggct-5′

[0191] After priming, polymerase extension of primer 1ex on the template results in:
[SEQ ID NO: 7]: 5′-gcagcagtacgactagcataTCCGACctgcgtgtagcgcacgtaccggggtacgtagtccgactgctggcagcatgcagatgagccga-3′
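The extension of primer 1ex on the genomic template can be reproduced with a small sketch. The function name and the rule of requiring a perfect match for the last six primer bases are simplifying assumptions for illustration; an actual polymerase tolerates mismatches according to the annealing conditions discussed above. The template is supplied written 3′ to 5′, as in the text, so pairing is position by position without reversal.

```python
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def extend_primer(primer, template_3to5, anchor_len=6):
    """Extend `primer` (5'->3') along `template_3to5` (written 3'->5').

    The last `anchor_len` bases of the primer must pair perfectly with the
    template; the first such site found is used (illustrative assumption).
    """
    anchor = primer[-anchor_len:].lower()
    site = anchor.translate(COMPLEMENT)        # template bases, read 3'->5'
    start = template_3to5.lower().find(site)
    if start < 0:
        raise ValueError("no priming site found")
    downstream = template_3to5[start + anchor_len:]
    return primer + downstream.translate(COMPLEMENT)


primer_1ex = "gcagcagtacgactagcataTCCGACctgcgt"
genomic = "tttcgacgcacatcgcgtgcatggccccatgcatcaggctgacgaccgtcgtacgtctactcggct"
extended = extend_primer(primer_1ex, genomic)
print(extended)
```

Run on the two sequences from the text, the sketch yields the extended primer 1ex strand listed in this paragraph.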

[0192] Out of the pool of 4096 primer 2 sequences, one primer with high 3′ end complementarity to a random position in the extended primer 1ex DNA is selected by a polymerase for priming (priming site in bold):
[SEQ ID NO: 7]: 5′-gcagcagtacgactagcataTCCGACctgcgtgtagcgcacgtaccggggtacgtagtccgactgctggcagcatgcagatgagccga-3′
primer 2ex [SEQ ID NO: 8; bases 1-32]: 3′-gacgacCAGCCTggacgtacgatgacgatagc-5′

[0193] After priming and synthesis, the resulting second strand is:
[SEQ ID NO: 8]: 3′-cgtcgtcatgctgatcgtatAGGCTGgacgcacatcgcgtgcatggccccatgcatcaggctgacgacCAGCCTggacgtacgatgacgatagc-5′

[0194] Primer 3, shown below, is a 26-mer that is identical to the constant region of primer 1ex:
[SEQ ID NO: 7; nucleotides 1-26]: 5′-gcagcagtacgactagcataTCCGAC-3′

[0195] It is immobilized by a 5′ acrylyl group to a polyacrylamide layer on a glass slide. Primer 4, below, is a 26-mer that is complementary to the constant region of primer 2ex:
[SEQ ID NO: 8; nucleotides 1-26]: 5′-cgatagcagtagcatgcaggTCCGAC-3′

[0196] It is optionally immobilized to the polyacrylamide layer by a 5′ acrylyl group.

[0197] The pool of amplified molecules derived from the sequential priming of the original nucleic acid preparation with mixed primers 1 and 2, including the product of 1ex/2ex priming and extension, is hybridized to immobilized primers 3 and 4. In situ PCR is performed as described above, resulting in the production of a first random, immobilized array of nucleic acid molecules according to the invention. This array is replicated by the methods described in Example 1 in order to create a plurality of such arrays according to the invention.

[0198] After in situ PCR using primers 3 and 4:
5′-gcagcagtacgactagcataTCCGACctgcgtgtagcgcacgtaccggggtacgtagtccgactgctgGTCGGAcctgcatgctactgctatcg-3′ [SEQ ID NO: 9]
3′-cgtcgtcatgctgatcgtatAGGCTGgacgcacatcgcgtgcatggccccatgcatcaggctgacgacCAGCCTggacgtacgatgacgatagc-5′ [SEQ ID NO: 8]

[0199] After cutting with MmeI and removal of the non-immobilized strands:
[SEQ ID NO: 9; bases 1-46]: 5′-gcagcagtacgactagcataTCCGACctgcgtgtagcgcacgtacc-3′ (primer 1-based, clone-specific oligonucleotide)
[SEQ ID NO: 8; bases 1-46]: 3′-ccatgcatcaggctgacgacCAGCCTggacgtacgatgacgatagc-5′ (primer 2-based, clone-specific oligonucleotide)
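The 46-base oligonucleotides listed in this paragraph can be derived from the full-length strands with a short sketch. It assumes, following the worked example rather than any enzyme specification, that the immobilized strand keeps 20 bases on the 3′ side of the first TCCGAC site; the function and constant names are hypothetical.

```python
MMEI_SITE = "TCCGAC"
CUT_OFFSET = 20   # bases retained 3' of the site, as in the worked example


def immobilized_fragment(strand_5to3, site=MMEI_SITE, offset=CUT_OFFSET):
    """Return the 5' portion of a strand left on the support after cleavage.

    Only the first occurrence of the site (the one supplied by the
    5'-immobilized primer) is considered; the strand is read 5'->3'.
    """
    idx = strand_5to3.upper().find(site)
    if idx < 0:
        raise ValueError("recognition site not found")
    return strand_5to3[: idx + len(site) + offset]


# Upper strand (5'->3') and lower strand (written 3'->5') from the text.
upper = ("gcagcagtacgactagcataTCCGACctgcgtgtagcgcacgtaccggggtacgtagtcc"
         "gactgctgGTCGGAcctgcatgctactgctatcg")
lower_3to5 = ("cgtcgtcatgctgatcgtatAGGCTGgacgcacatcgcgtgcatggccccatgcatca"
              "ggctgacgacCAGCCTggacgtacgatgacgatagc")

upper_oligo = immobilized_fragment(upper)
# Reverse the lower strand to read it 5'->3', cut, then restore 3'->5' order.
lower_oligo_3to5 = immobilized_fragment(lower_3to5[::-1])[::-1]
print(upper_oligo)
print(lower_oligo_3to5)
```

Each result is the 26-base constant primer region plus 20 clone-specific bases, which is why the features carry well-defined 3′ ends.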

[0200] The resulting random arrays of oligonucleotide primers representing the nucleic acid sequences of the original preparation are useful in several ways. Any particular feature, such as the above pair of primers, is used selectively to amplify the intervening sequence (in this case two central bp of the original 42 bp cloned segment are captured for each use of the chip or a replica) from a second nucleic acid sample. This is performed in solution or in situ, as described above, following feature identification on the array, using free, synthetic primers. If desired, allele-specific primer extension or subsequent hybridization is performed.

[0201] Importantly, this technique provides a means of obtaining corresponding, or homologous, nucleic acid arrays from a second cell line, tissue, organism or species according to the invention. The ability to compare corresponding genetic sequences derived from different sources is useful in many experimental and clinical situations. By “corresponding genetic sequences”, one means the nucleic acid content of different tissues of a single organism or tissue-culture cell lines. Such sequences are compared in order to study the cell-type specificity of gene regulation or mRNA processing or to observe chromosomal rearrangements that might arise in one tissue rather than another. Alternatively, the term refers to nucleic acid samples drawn from different individuals, in which case a given gene or its regulation is compared between or among samples. Such a comparison is of use in linkage studies designed to determine the genetic basis of disease, in forensic techniques and in population genetic studies. Lastly, it refers to the characterization and comparison of a particular nucleic acid sequence in a first organism and its homologues in one or more other organisms that are separated evolutionarily from it by varying lengths of time in order to highlight important (therefore, conserved) sequences, estimate the rate of evolution and/or establish phylogenetic relationships among species. The invention provides a method of generating a plurality of immobilized nucleic acid arrays, wherein each array of the plurality contains copies of nucleic acid molecules from a different tissue, individual organism or species of organism.

[0202] Alternatively, a first array of oligonucleotide primers with sequences unique to members of a given nucleic acid preparation is prepared by means other than the primed synthesis described above. To do this, a nucleic acid sample is obtained from a first tissue, cell line, individual or species and cloned into a plasmid or other replicable vector which comprises, on either side of the cloning site, a type IIS enzyme recognition site sufficiently close to the junction between vector and insert that cleavage with the type IIS enzyme(s) recognizing either site occurs within the insert sequences, at least 6 to 10, preferably 10 to 20, base pairs away from the junction site. It is contemplated that type IIS restriction endonuclease activity may even occur at a distance of up to 30 base pairs from the junction site. The nucleic acid molecules are cleaved from the vector using restriction enzymes that cut outside of both the primer and oligonucleotide sequences, and are then immobilized on a semi-solid support according to the invention by any of the methods described above in which covalent linkage of molecules to the support occurs at their 5′ termini, but does not occur at internal bases. Cleavage with the type IIS enzyme (such as MmeI) to yield the immobilized, sequence-specific oligonucleotides is performed as described above in this Example.

[0203] As mentioned above, it is not necessary to immobilize primer 4 on the support. If primer 4 is left free, the in situ PCR products yield the upper (primer 1 derived) strand upon denaturation:
5′-gcagcagtacgactagcataTCCGACctgcgtgtagcgcacgtaccggggtacgtagtccgactgctgGTCGGAcctgcatgctactgctatcg-3′ [SEQ ID NO: 9]

[0204] This sequence is available for hybridization to fluorescently-labeled DNA or RNA for mRNA quantitation or genotyping.

EXAMPLE 7

[0205] Isolation of a Feature from an Array of the Invention (Method 2)

[0206] As described above, laser-capture microdissection is performed in order to help orient a worker using the arrays of a set of arrays produced according to the invention, or to remove undesirable features from them. Alternatively, this procedure is employed to facilitate the cloning of selected features of the array that are of interest. The transfer of the nucleic acid molecules of a given feature or group of features from the array to a thin film of EVA or another heat-sensitive adhesive substance is performed as described above. Following those steps, the molecules are amplified and cloned as follows:

[0207] The transfer film and adherent cells are immediately resuspended in 40 μl of 10 mM Tris-HCl (pH 8.0), 1 mM EDTA and 1% Tween-20, and incubated overnight at 37° C. in a test tube, e.g. a polypropylene microcentrifuge tube. The mixture is then boiled for 10 minutes. The tubes are briefly spun (1000 rpm, 1 min.) to remove the film, and 0.5 μl of the supernatant is used for PCR. Typically, the sheets of transfer film initially applied to the array are small circular disks (diameter 0.5 cm). For more efficient elution of the transferred material after LCM transfer, the disk is placed into a well in a 96-well microtiter plate containing 40 μl of extraction buffer. Oligonucleotide primers specific for the sequence of interest may be designed and prepared by any of the methods described above. PCR is then performed according to standard methods, as described in the above examples.

EXAMPLE 8

[0208] Excluded Volume Protecting Groups

[0209] The density of features of the arrays is limited in that they must be sufficiently separated to avoid contamination of adjacent features during repeated rounds of amplification and replication. This is achieved using dilute concentrations of nucleic acid pools, but results in density limited by the Poisson distribution to a maximum of 37% occupancy of available appropriately spaced sites. In order to increase the density of features while maintaining the spacing necessary to avoid cross contamination, the following approach may be taken.

[0210] An activity which can bind the nucleic acid molecules of the pool is positioned in spots on the surface of the array support to create a capture array. The spots of the capture array are arranged such that they are separated by a distance greater than the size of the spots (this is typically near the resolution of the intended detection and imaging devices, or approximately 3 microns). The size of the spots is set to be less than the diameter of the excluded volume of the nucleic acid polymer to be captured (for example, approximately one micron for 50 kb lambda DNA in 10 mM NaCl; see Rybenkov et al., 1993, Proc. Natl. Acad. Sci. U.S.A. 90: 5307-5311, Zimmerman & Trach, 1991, J. Mol. Biol. 222: 599-620, and Sobel & Harpst, 1991, Biopolymers 31: 1559-1564, incorporated herein by reference, for methods of predicting excluded volumes of nucleic acids).

[0211] The “nucleic acid capture activity” of the array may be a hydrophilic compound, a compound which reacts covalently with the nucleic acid polymers of the pool, an oligonucleotide complementary to a sequence shared by all members of a pool (e.g., an oligonucleotide complementary to the 12 bp cohesive ends of a phage λ library, or oligonucleotide(s) complementary to one or both ends of a PCR-generated library containing large inserts and 6 to 50 bp of one strand exposed at one or both ends) or some other capture ligand including but not limited to proteins, peptides, intercalators, biotin, avidin, antibodies or fragments of antibodies or the like.

[0212] An ordered array of nucleic acid capture ligand spots may be made using a commercially-available micro-array synthesizer, modified inkjet printer (Castellino, 1997, supra), or the methods disclosed by Fodor et al. (U.S. Pat. No. 5,510,270), Lockhart et al. (U.S. Pat. No. 5,556,752) and Chetverin and Kramer (WO 93/17126). Alternatively, details on the design, construction and use of a micro-array synthesizer are available on the World Wide Web at www.cmgm.stanford.edu/pbrown.

[0213] An excess of nucleic acid or DNA is then applied to the surface of the microfabricated capture array. Each spot has multiple chances to bind a free nucleic acid molecule. However, once a spot has bound a nucleic acid molecule, it is protected from binding other molecules, i.e., the excluded volume of the bound DNA protects the spot from binding more than one molecule from the pool. Thus, saturation binding, or a situation very close to it, may be achieved while retaining the optimal spacing for subsequent amplification and replication.

[0214] The array resulting from this process may be amplified in situ and replicated according to methods described herein. Alternatively, or in addition, the array may be treated in a way which decreases the excluded volume of the captured group so that additional rounds of excluded volume protecting group (EVPG) addition may be performed. Arrays produced in this manner not only increase the efficiency of the array beyond that normally allowed by the Poisson distribution, but also can be of predetermined geometry and/or aligned with other microfabricated features. In addition, such arrays allow complicated highly parallel enzymatic or chemical syntheses to be performed on large DNA arrays.
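The Poisson limit referred to in this Example (a maximum of about 37% of sites holding exactly one molecule) can be verified numerically. The sketch below assumes that molecules land on sites independently with a mean loading of `lam` molecules per site; the function names and the scanned range are illustrative choices.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k molecules at a site when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def single_occupancy(lam):
    """Fraction of sites holding exactly one molecule (a usable feature)."""
    return poisson_pmf(1, lam)


# Scan mean loadings from 0.01 to 3.00 molecules per site.
loadings = [i / 100 for i in range(1, 301)]
best = max(loadings, key=single_occupancy)
print(f"best mean loading = {best:.2f}, "
      f"single occupancy = {single_occupancy(best):.3f}")
```

Single occupancy peaks at a mean loading of one molecule per site, where it equals 1/e; at that loading the remaining sites are either empty or hold more than one molecule, which is the inefficiency the excluded volume approach is intended to avoid.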

EXAMPLE 9

[0215] Replica-destructive Amplification Methods

[0216] A major advantage of the replica amplification method is that because there are multiple copies of a particular array, information is not lost if a given replica is destroyed or rendered non-re-usable by a process. This allows the use of the most sensitive detection methods, regardless of their impact on the subsequent usefulness of that particular replica of the array. For example, tyramide-biotin/HRP (or other enzymatic in situ reactions) or biotin/avidin or antibody/hapten complexes (or other ligand sandwiches) may be used to effectively amplify the signal in a nucleic acid hybridization (or other bimolecular binding) experiment. These methods, however, may be considered destructive to the DNA array in that they involve interactions which are kinetically difficult to disrupt without destroying the array. Similarly, some detection processes, including sequencing by ligation and restriction and the variant methods described herein (see Examples 11 and 12), necessarily involve destruction, either chemically or enzymatically or both, of the template array. The availability of replica arrays made according to the methods disclosed herein allows the use of these methods, as they destroy only the replica, not the original or other copies.

[0217] The availability of replicas of an array allows the use of direct fluorescent detection of probes hybridized to the array without loss of the array for subsequent uses. One method which this allows is the relative quantitation of mRNA by hybridization of the array with fluorescently labeled total cDNA probes. This method allows the evaluation of changes in the expression of a wide array of genes in populations of RNA isolated from cells or tissues in different growth states or following treatment with various stimuli.

[0218] Fluorescently labeled cDNA probes are prepared according to the methods described by DeRisi et al., 1997, Science 278: 680-686 and by Lockhart et al., 1996, Nature Biotechnol. 14: 1675-1680. Briefly, each total RNA (or mRNA) population is reverse transcribed from an oligo-dT primer in the presence of a nucleoside triphosphate labeled with a spectrally distinguishable fluorescent moiety. For example, one population is reverse transcribed in the presence of Cy3-dUTP (green fluorescence signal), and another reverse transcribed in the presence of Cy5-dUTP (red fluorescence signal).

[0219] Hybridization conditions are as described by DeRisi et al. (1997, supra) and Lockhart et al. (1996, supra). Briefly, final probe volume should be 10-12 μl, at 4× SSC, and contain non-specific competitors (e.g., poly dA, C₀t-1 DNA for a human cDNA array) as required. To this mixture is added 0.2 μl of 10% SDS and the probes are boiled for two minutes and quick chilled for ten seconds. The denatured probes are pipetted onto the array and covered with a 22 mm × 22 mm cover slip. The slide bearing the array is placed in a humid hybridization chamber which is then immersed in a water bath (62° C.) and incubated for 2-24 hours. Following incubation, slides are washed in a solution containing 0.2× SSC, 0.1% SDS and then in 0.2× SSC without SDS. After washing, excess liquid is removed by centrifugation in a slide rack on microtiter plate carriers. The hybridized arrays are then immediately ready for scanning with a fluorescent scanning confocal microscope. Such microscopes are commercially available; details concerning design and construction of a scanner are also available on the World Wide Web at www.cmgm.stanford.edu/pbrown.

[0220] In the above example in which one population of RNA was reverse-transcription labeled with Cy3 and the other with Cy5 fluorescent dyes, the relative expression of genes represented by the features of the micro-array may be evaluated by the presence of green (Cy3, indicating the mRNA from this population hybridizes to a given feature), red (Cy5, indicating the mRNA from this population hybridizes to a given feature) or yellow (indicating that both mRNA populations used to make probes contain mRNAs which hybridize to a given feature) fluorescent signals.
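
The green/red/yellow reading of a two-color hybridization can be sketched as a simple per-feature classification. This is only an illustration: the function name `call_feature` and the `background` threshold are hypothetical and are not part of the disclosed method, and a real scanner analysis would use calibrated background estimates.

```python
def call_feature(cy3, cy5, background=100.0):
    """Classify one array feature from its Cy3 and Cy5 intensities.

    `background` is an assumed, arbitrary cutoff (illustrative only):
    a channel counts as "present" when its intensity exceeds it.
    """
    has_cy3 = cy3 > background  # mRNA from the Cy3-labeled population hybridized
    has_cy5 = cy5 > background  # mRNA from the Cy5-labeled population hybridized
    if has_cy3 and has_cy5:
        return "yellow"  # both populations contain hybridizing mRNA
    if has_cy3:
        return "green"
    if has_cy5:
        return "red"
    return "none"


if __name__ == "__main__":
    for cy3, cy5 in [(850.0, 20.0), (15.0, 900.0), (700.0, 650.0)]:
        print(cy3, cy5, call_feature(cy3, cy5))
```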

[0221] Alternatively, separate replicas of the same array may be hybridized separately with probes labeled with the same fluorescent dye marker but made from different populations of mRNA. For example, cDNA probes made from cells before and after treatment with a growth factor may be hybridized with separate replicas of a genomic array made from those cells. The intensity of the signal of each feature may be compared before and after growth factor treatment to yield a representation of genes induced, repressed, or whose expression is unaffected by the growth factor treatment. This method requires that the replica arrays contain one or more markers which will not vary as a means of aligning the hybridized arrays. Such a marker may be a foreign or synthetic DNA, for example. The RNA corresponding to such a marker is spiked at equal concentration into the reverse transcription reactions used to generate labeled cDNA probes. Prior to the first hybridization with experimental cDNAs, a control hybridization using only the marker cDNA may be performed on a replica array to precisely determine the position(s) of the marker(s) within the array.
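
As an illustration of comparing two separately hybridized replicas, the sketch below scales each replica by the mean signal of its invariant marker features and then reports a per-feature ratio. Using the markers for intensity scaling, rather than only for positional alignment, is an assumption made here for the example; the function name and the dictionary layout are likewise hypothetical, and all intensities are assumed to be positive.

```python
def normalized_fold_changes(before, after, marker_ids):
    """Return after/before signal ratios, scaled by invariant markers.

    before, after: dicts mapping feature id -> intensity on each replica.
    marker_ids: ids of the spiked-in marker features (assumed invariant).
    """
    scale_before = sum(before[m] for m in marker_ids) / len(marker_ids)
    scale_after = sum(after[m] for m in marker_ids) / len(marker_ids)
    ratios = {}
    for feature, signal in before.items():
        if feature in marker_ids:
            continue  # markers are used only for scaling
        ratios[feature] = (after[feature] / scale_after) / (signal / scale_before)
    return ratios


if __name__ == "__main__":
    before = {"marker": 100.0, "geneA": 50.0, "geneB": 200.0}
    after = {"marker": 200.0, "geneA": 200.0, "geneB": 400.0}
    print(normalized_fold_changes(before, after, ["marker"]))
```

A ratio above 1 suggests induction, below 1 repression, and near 1 no change under these assumptions.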

[0222] In either the simultaneous hybridization or the separate hybridization methods, the availability of additional replicas of the array allows further characterization (including but not limited to sequencing and isolation of the gene represented by the feature) of those features of the array which exhibit particular expression patterns.

EXAMPLE 10

[0223] Geometrical Focusing

[0224] A characteristic of the replica amplification process is that each replica will tend to occupy a larger area than the feature from which it was made. This is because the feature molecules transferred to the replica may come from anywhere within the circumferential area occupied by the template feature. Subsequent amplification of the transferred molecules will necessarily increase the area occupied by the feature relative to that occupied by the template feature. It is clear that this phenomenon will limit the practical number of times an array may be sequentially replicated without contamination of surrounding features. There are several approaches to solving this problem.

[0225] First, as mentioned previously, more than one replica of an amplified array may be made per amplification. It is clear that the “earlier” in the replication process a given array is replicated, the less area its features will occupy relative to those made later. That is, the more replicas one can make of an original amplified array before re-amplifying the template, the more arrays with smaller features one will have. The number of replicas of a given array which may be made without re-amplification of the template may be determined empirically by, for example, hybridization of a sequential series of amplified replicas from a single array with an oligonucleotide which hybridizes with a sequence common to every feature. Comparison of the hybridization signals from the first replica to those of subsequent replicas made from the same template without re-amplification of the template will indicate at what point features begin to be lost from the replicas.

[0226] Second, one may reduce the number of PCR cycles used in the amplification process. Because the amplification is exponential, a small change in the cycle number can have a profound influence on the area occupied by the feature. This will clearly not solve the problem completely, but when combined with the first approach it can extend the useful number of cycles of amplification and replication for a given array. The practical number of PCR cycles to use for each round of amplification may also be estimated empirically by making several replicas from a single template array without re-amplification, and then subjecting individual replicas in the series to increasing numbers of PCR cycles. For example, replicas may be subjected to 10, 20, and 30 amplification cycles, followed by hybridization with a fluorescent probe sequence common to all features of the array. Visualization of the hybridized array by fluorescence microscopy will indicate at which point the features begin to intrude upon one another. Clearly, the starting size of the feature will influence the number of PCR cycles allowable per replication cycle, but it is within the ability of one skilled in the art to determine generally how many cycles are optimal to obtain enough DNA for subsequent rounds of replica amplification without widespread contamination of surrounding features.

[0227] A third approach recognizes the fact that the amplified features occupy more than just the two dimensional area of the surface they sit upon. Rather, each amplified feature occupies a hemispherical space with a radius, r. If the features are situated on one slide, which for discussion will be designated the “bottom” slide, and covered by another slide (the “top” slide) set at a uniform, fixed distance from the bottom slide, one will note that as the hemispherical feature expands with rounds of amplification, the portion of the growing hemisphere which first contacts the top slide will be much smaller in cross-sectional area than the portion in contact with the bottom slide. This presents a smaller surface area, with all sequence information intact, from which to make replicas that do not occupy greater surface area than their template features. This method will be referred to as “geometrical focusing.”

[0228] For example, after 30 cycles in 15% polyacrylamide, 500 bp amplicons will form hemispheres with a 10 micron radius. The length of the template and the percentage of acrylamide in the gel influence the size of the amplified features such that, for a given number of cycles, the size of the features decreases as the length of the template or the percentage of acrylamide increases. In general, the size of an amplified feature with respect to a given number of amplification cycles under given conditions is determined empirically by visualizing it with a fluorescent confocal microscope or fluorimager after staining with a fluorescent intercalator. Labeled primers or nucleotides may also be used to “light up” the feature for measurement by this method.

[0229] The distance between the surface bearing the array and the surface the array is to be transferred to may be controlled using plastic spacers of the desired thickness along the edges of the slide. A small volume of polyacrylamide solution plus capillary action will take the volume out to the edges of a predetermined area of coverslip.

[0230] Another contemplated method of regulating or controlling the distance between surfaces in the geometrical focusing method involves the use of optical feedback, such as Newton rings or other interferometry, to adjust pressure locally across the surfaces. The adjustment may be accomplished by a scanning laser that heats a differential thermal expansion plate differentially based on the optical feedback.

[0231] As mentioned above, bioactive substances such as enzymes may be cast directly in polyacrylamide gels. Other reagents, including buffers and oligonucleotide primers, may be either cast into the gels or added by diffusion or even electrophoretic pulses to the pre-formed gel matrices. If the upper plate has little or no adhesiveness to the gel (achieved, for example, through silane coating as described above), then when it is removed, the upper circle of each hemisphere is the only exposed DNA. Some of the exposed DNA can be transferred by microcontact printing using either plate, or by another round of polymerization from the upper plate. The radius of the circle exposed for transfer will be c = sqrt(r² - d²), where r is the radius of the hemisphere and d is the distance between the plates. Therefore, when r = 10 microns and d = 8 microns, the radius of the exposed circle is c = 6 microns, less than the size of the template feature. This exposed circle will thus have a cross-sectional area less than that occupied by the template feature, referred to as q, at the surface of the support. This slight reduction in the radius, and consequently in the cross-sectional area of the transferred feature, will work to keep the amplified replica features sharper through several rounds of replication. The distance between the plates may be 10%, 20%, 30%, 40%, on up to 50% or more less than the radius of the features being transferred. The surface area (of the support) occupied by the transferred features may be considered reduced or lessened if it is 10%, 20%, 30%, 40%, on up to approximately 80% less than the area occupied by features on the template array. The resolution of the features is considered to be preserved if the features remain essentially distinct after amplification of the transferred nucleic acid. It is noted that features which amplify with lower efficiency than others may be lost if the distance between plates is too large. Therefore, geometrical focusing will be most useful when combined with the other two approaches described for limiting the size of amplified replicas. That is, the number of replicas made from individual arrays early in the process should be maximized while the number of PCR cycles per amplification should be minimized.
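
The geometrical focusing relation can be checked numerically. The sketch below evaluates c = sqrt(r² - d²) and the resulting reduction in footprint area, assuming (as an idealization for this example) that the template feature's footprint on the bottom slide is a circle of radius r. The function names are illustrative only.

```python
import math


def exposed_radius(r, d):
    """Radius c of the circle where a hemisphere of radius r meets a plate
    held at distance d above its base: c = sqrt(r^2 - d^2).

    Returns 0.0 when the hemisphere does not reach the plate (d >= r).
    """
    if d >= r:
        return 0.0
    return math.sqrt(r * r - d * d)


def area_reduction(r, d):
    """Fractional reduction of the exposed circle's area relative to a
    template footprint assumed to be a circle of radius r."""
    c = exposed_radius(r, d)
    return 1.0 - (c * c) / (r * r)


if __name__ == "__main__":
    r, d = 10.0, 8.0  # microns, the values used in the text
    print(exposed_radius(r, d))   # 6.0 microns
    print(area_reduction(r, d))   # about 0.64, i.e. roughly 64% less area
```

With r = 10 microns and d = 8 microns the exposed circle covers about 36% of the template footprint, which falls within the 10% to approximately 80% reduction range given above.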

EXAMPLE 11

[0232] Replica Sequencing with Ligation/restriction Cycles

[0233] The sequencing by ligation and restriction method of Brenner, as described above, provides a powerful approach to the simultaneous sequencing of entire arrays of DNA molecules. The ability to replicate the entire array provides a novel approach to improving the efficiency of the sequencing method. In its standard format, the number of bases sequenced by the ligation and restriction method is limited by a background of molecules which fail to ligate or cleave properly in a given cycle. This phenomenon disturbs the synchrony of the process and limits the effective lengths which may be sequenced by this method since the interference it introduces is cumulative.

[0234] The sequencing by ligation and restriction method as disclosed by Brenner addresses this issue by the optional inclusion of a “capping” step after the unligated probe has been removed. According to that method, when the target molecules have a 5′ protruding end, a mixture of dideoxynucleoside triphosphates and a DNA polymerase is added prior to the next cleavage step. This results in the addition of a single dideoxynucleotide to the 3′ terminus of the recessed strand which will prevent subsequent ligation steps, effectively deleting the molecule which failed to be ligated from the target population. The effectiveness of the capping method is dependent on the completeness of the cap addition.

[0235] An improvement on the method of sequencing by ligation and cleavage involves the use of two or more distinct probes comprising different “ligation cassettes” coupled with a round of replica amplification by PCR wherein one of the primers is specific to the most recently added ligation cassette. This method will be referred to as “replica sequencing with ligation and restriction cycles.” A probe of use in this method is a double-stranded polynucleotide which (i) contains a recognition site for a nuclease, (ii) typically has a protruding strand capable of forming a duplex with a complementary protruding strand of the target polynucleotide, and (iii) which has a sequence, the “ligation cassette,” such that an oligonucleotide primer complementary to one such sequence or cassette will allow amplification of the molecule to which it is ligated under the conditions used for annealing and extension within the method.

[0236] In each sequencing cycle, only those probes whose protruding strands form perfectly-matched duplexes with the protruding strand of the target polynucleotide hybridize and are then ligated to the end of the target polynucleotide. The probe molecules are divided into four populations, wherein each such population comprises one of the four possible nucleotides at the position to be determined, each labeled with a distinct fluorescent dye. The remaining positions of the duplex-forming region are occupied with randomized, unlabeled bases, so that every possible multimer the length of that region is represented; therefore, a certain percentage of probe molecules in each pool are complementary to the single-stranded region of the target polynucleotide; however, only one pool bears labeled probe molecules that will hybridize.

[0237] The individual probes comprising different ligation cassettes may have a recognition sequence for the same or different type IIs restriction endonuclease. The important factor is that the ligation cassette sequences, due to their distinct primer binding characteristics, allow amplification of only those target molecules which were successfully ligated in the previous ligation step. This also enforces the requirement for completing the cleavage step, as those target molecules which were not cleaved in the previous step will similarly not be amplified, since they will not bear the proper primer. This process enriches the proportion of each feature which has successfully completed the most recent cycle of ligation and restriction. Through the reduction in background due to improved synchrony, this method increases the number of bases which can be sequenced for features on a given array. The added steps of the replication and subsequent re-amplification of the array not only further enrich for sequences which are in synchrony, but also confer control over the size of the features, as described herein in the section entitled “Geometrical Focusing”. As discussed in that section, control over the size of the features with increasing numbers of amplification or replication cycles allows more sequence or other information to be gleaned from a given array before features begin to overlap.

[0238] After a cycle of cleavage, ligation of a first ligation cassette, and subsequent detection of the next base in the sequence, the steps one will perform in applying the replica amplification process to this method of sequencing are as follows: 1) using primers, one complementary to the common end (arbitrarily designated the 5′ end, for this discussion) of the features being sequenced, and the other complementary to the most recently added ligation cassette, the features of the array are amplified and then replicated according to methods described herein above; 2) a replica is then subjected to a new cycle of cleavage, ligation of a probe comprising a distinct ligation cassette, and detection of the next base in the sequence; 3) the features of the array are amplified using the primer complementary to the common 5′ end of the features and a primer complementary to the distinct ligation cassette, followed by replication of the array; and 4) the process of steps 1-3 is repeated until the sequences of the features are determined.

[0239] Within the method of replica sequencing with ligation and restriction cycles, a new probe comprising a distinct ligation cassette sequence may be used for each cycle of ligation and restriction. Alternatively, fewer different ligation cassettes than the number of cycles of ligation and restriction may be used. In other words, as few as two and as many as n (where n equals the number of cycles of ligation and restriction) different ligation cassettes may be of use according to the method. As used herein, “new” or “different” or “distinct” when referring to probes or ligation cassettes comprised by probes is meant to indicate that the sequence of each ligation cassette, or the oligonucleotide probe comprising it, is such that a primer complementary to the ligation cassette will not hybridize with any other cassette or oligonucleotide comprising a cassette under the conditions used for annealing and polymerization. Clearly, the greater the number of different ligation cassettes used, the more strictly the requirement for completion of previous cycles will be enforced. It is within the ability of one of skill in the art to determine how many different ligation cassettes are required to achieve a desired level of synchrony (with a concomitant reduction in background). As a general guideline, since the background due to incomplete cycles is cumulative, the number of ligation cassettes will vary in proportion to the desired number of bases to be sequenced. One would, for example, expect to use a larger number of different ligation cassettes if 300 bases are to be sequenced than one would use to sequence 30 bases.
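
One way to picture the use of between two and n ligation cassettes is a round-robin schedule. The sketch below is an assumption made for illustration (the text does not prescribe how cassettes are assigned to cycles); the names `cassette_schedule` and `earlier_cycles_with_same_cassette` are hypothetical. The second helper lists earlier cycles whose cassette would be recognized by the same primer, which is where enforcement weakens when few cassettes are used.

```python
def cassette_schedule(n_cycles, n_cassettes):
    """Assign a cassette to each ligation/restriction cycle in rotation.

    Round-robin assignment is an illustrative assumption, not a
    requirement of the method.
    """
    if n_cassettes < 2:
        raise ValueError("at least two distinct ligation cassettes are needed")
    return [
        (cycle, "cassette_%d" % ((cycle - 1) % n_cassettes + 1))
        for cycle in range(1, n_cycles + 1)
    ]


def earlier_cycles_with_same_cassette(cycle, n_cassettes):
    """Earlier cycles that used the same cassette as `cycle` under the
    round-robin schedule; the selection primer cannot tell these apart."""
    return [c for c in range(1, cycle) if (cycle - c) % n_cassettes == 0]


if __name__ == "__main__":
    print(cassette_schedule(5, 2))
    print(earlier_cycles_with_same_cassette(5, 2))  # cycles 1 and 3
    print(earlier_cycles_with_same_cassette(5, 5))  # none: one cassette per cycle
```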

[0240] Replication of the arrays in the method of replica sequencing by ligation and restriction may be performed as often as every cycle, once every nth cycle (where n is greater than 1), or even once per whole set of cycles. Again, the frequency of replication may be determined by one skilled in the art. Considerations include, but are not limited to, the physical size of the features and the overall desired number of bases to be sequenced.
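
The effect of selection frequency on synchrony can be illustrated with a toy model. The assumptions here are not from the text and are deliberately idealized: every cycle succeeds independently with probability `p` per molecule, the in-phase fraction is read at the detection step, and a selective amplification performed after detection restores a fully in-phase population. The model ignores PCR errors and feature growth.

```python
def in_phase_fractions(p, n_cycles, select_every=None):
    """Fraction of in-phase molecules seen at detection in each cycle.

    p: assumed per-cycle probability that a molecule completes cleavage
       and ligation (illustrative parameter).
    select_every: perform an idealized, perfect selective amplification
       after every `select_every`-th cycle; None means no selection.
    """
    fraction = 1.0
    seen_at_detection = []
    for cycle in range(1, n_cycles + 1):
        fraction *= p  # failures accumulate cumulatively
        seen_at_detection.append(fraction)
        if select_every and cycle % select_every == 0:
            fraction = 1.0  # idealized enrichment for molecules in synchrony
    return seen_at_detection


if __name__ == "__main__":
    print(in_phase_fractions(0.9, 5))                  # no selection: decays as p**cycle
    print(in_phase_fractions(0.9, 5, select_every=1))  # selection every cycle
    print(in_phase_fractions(0.9, 5, select_every=2))  # selection every 2nd cycle
```

Under these assumptions the unselected fraction decays geometrically, while selection every nth cycle bounds the decay at p raised to the nth power.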

[0241] The method of Jones, 1997, Biotechniques 22: 938-946 teaches the use of PCR amplification to positively select for those molecules in a population which had successfully completed the previous cycle of cleavage and ligation. Jones did not, however, teach the replication of amplified populations or the application of the method to random arrays of features. Rather, Jones taught the use of microwell plates and a robotic pipetting apparatus to perform his method. An important advantage of the incorporation of the replication step into the sequencing method is that it allows control over the size of the amplified features. While Jones mentions the eventual application of his method to the “biochip” format, no guidance is given which would allow one to overcome the inherent limitation on the size of the features in a method incorporating PCR amplification steps on a microarray. In contrast, novel methods based on the replication of arrays, such as geometrical focusing, are described herein which overcome this limitation.

EXAMPLE 12

[0242] Non-replica Sequencing

[0243] Methods allowing determination of DNA sequences on an array that do not involve replica production are also preferred for some applications. For example, sequencing of transcription products (or their reverse transcripts) in situ requires that the fine resolution of the sequencing templates be preserved.

[0244] One may use the method of Jones (1997, supra) to sequence features on an array without replicating the array. Other non-electrophoretic methods which might be adapted to sequencing of microarrays include the single nucleotide addition methods of minisequencing (Canard & Sarfati, 1994, Gene 148: 1-6; Shoemaker et al., 1996, Nature Genet. 14: 450-456; Pastinen et al., 1997, Genome Res. 7: 606-614; Tully et al., 1996, Genomics 34: 107-113; Jalanko et al., 1992, Clin. Chem. 38: 39-43; Paunio et al., 1996, Clin. Chem. 42: 1382-1390; Metzker et al., 1994, Nucl. Acids Res. 22: 4259-4267) and pyrosequencing (Uhlen & Lundeberg, U.S. Pat. No. 5,534,424; Ronaghi et al., 1998, Science 281: 363-365; Ronaghi et al., 1999, Anal. Biochem. 267: 65-71).

[0245] As an alternative to minisequencing or pyrosequencing, the novel method of fluorescent in situ sequencing extension quantification (FISSEQ) may be used. FISSEQ involves the following steps: 1) a mixture of primer, buffer and polymerase is added to a microarray of single stranded DNA; 2) a single, fluorescently labeled base is added to the mixture, and will be incorporated if it is complementary to the corresponding base on the template strand; 3) unincorporated dNTP is washed away; 4) incorporated dNTP is detected by monitoring fluorescence; 5) steps 2-4 are repeated (using fresh buffer and polymerase) with each of the four dNTPs in turn; and 6) steps 2-5 are repeated in cycles until the sequence is known.

[0246] It is recognized that polymerases used for sequencing become inefficient for further extension when 100% of bases added to a primer are non-native (i.e., fluorescently labeled). Therefore, the efficiency of FISSEQ may be further improved by employing a mixture of native and fluorescently labeled dNTP. The mixture allows incorporation of labeled bases at each position without requiring 100% adjacent non-native bases. Also, a photobleaching step after each set of one or more cycles may be incorporated to allow the background subtraction to act on a smaller number, with corresponding lower Poisson shot noise.

[0247] As an alternative to photobleaching or computational subtraction of accumulating fluorescence, it is contemplated that cleavable linkages between the fluorophore and the nucleotide may be employed. Cleavage may be accomplished, for example, by acid or base treatment, or by oxidation or reduction of the linkage. For example, a disulfide linkage may be reduced using thiol compounds such as dithiothreitol. Similarly, a cis-glycol linkage can be cleaved by periodate. These are examples of standard components of cleavable cross-linkers used for protein chemistry or for polyacrylamide gels. In this embodiment, cleavage could be done as often as every cycle, or less frequently, such as every other, every third, or every fifth or more cycles.

[0248] A modified embodiment of FISSEQ that allows longer effective reads involves extension for a fixed number of cycles with mixtures of three native (unlabeled) dNTPs interspersed with pulses of wash, up to a desired length. Following this, one begins cycles of adding one partially labeled (i.e., mixture of labeled and unlabeled) dNTP at a time. The triple dNTP cycles allow positioning of the polymerase a fixed distance from the primer and would use alternating sets of triphosphates (e.g., ACG, CGT, ACG, . . . ) chosen and concentration optimized to reduce false incorporation and failure to incorporate (Hillebrand et al., 1984, Nucl. Acids Res. 12: 3155-3171). This allows three times longer reads plus any advantage possibly conferred by having fewer potential misincorporation steps. It is contemplated that if the misincorporation rate (n−1 and extensible n+1 products) can be as low as 10⁻⁴, then read lengths longer than current electrophoresis-based methods are possible.

[0249] Another modification using the triple dNTP cycles is aimed at reducing the background caused by mismatch incorporation. If, for example, G:T mismatch pairing is a major source of misincorporation (Keohavong et al., 1993, PCR Meth. Appl. 2: 288-292), one should always include A with G, since the more stable A:T interaction will be favored over the less stable G:T interaction. For example, one may alternate triple mix 1 (dATP, dCTP, dGTP) with triple mix 2 (dCTP, dGTP, dTTP).

[0250] A more conservative version of FISSEQ which can allow determination of longer stretches of sequences at a time requires replicas of the array, and will be referred to as replica-FISSEQ. Replica arrays for this method may be made by the replica amplification methods described herein, or by a microarray spotting method using a microarray robot. By spotting the same DNA templates in known positions on the slide, the same effect can be obtained as with the replica-amplified features. In this embodiment, 30 identical arrays are made using the microarray robot. Stepping through 1 to 30 additions with native (unlabeled) dNTPs sets up the final base to be assessed for each array element. The final base is assessed by the sequential addition of each fluorescent dNTP as is normally done in minisequencing. Pyrosequencing data (Ronaghi et al., 1998, Science 281: 363) has shown that the polymerase extension reactions stay accurately in phase through at least 30 cycles of dNTP addition using natural nucleotides and Klenow exo- polymerase. To read out N bases with the single slide method described above requires 4N cycles of nucleotide addition and washing. The N-slide (triple dNTP, 4 cycles per slide) method (using N replicas), requires 2N(N-1)/3 cycles. The actual read lengths will be more than N bases (1.4N on average due to runs of identical bases). The same number of scans are required for the two methods.
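
The basic FISSEQ cycle (add one dNTP, wash, read fluorescence, repeat with each of the four dNTPs in turn) and the remark that runs of identical bases lengthen the read can be illustrated with an idealized simulation. The sketch assumes error-free incorporation, a signal exactly proportional to the number of bases added, and a template written in the order in which the polymerase reads it; the function name and these simplifications are illustrative only.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def fisseq_read(template, order="ACGT", max_rounds=100):
    """Simulate idealized single-dNTP additions on one template.

    template: bases in the order the polymerase encounters them.
    Returns (called_sequence, number_of_dNTP_additions).
    """
    # Strand to be synthesized, written in the direction of synthesis.
    to_synthesize = "".join(COMPLEMENT[base] for base in template)
    position = 0
    called = []
    additions = 0
    for _ in range(max_rounds):
        for dntp in order:
            if position >= len(to_synthesize):
                break
            additions += 1
            # The polymerase extends through a whole run of identical bases.
            run = 0
            while (position + run < len(to_synthesize)
                   and to_synthesize[position + run] == dntp):
                run += 1
            if run:  # fluorescence assumed proportional to run length
                called.append(dntp * run)
                position += run
        if position >= len(to_synthesize):
            break
    return "".join(called), additions


if __name__ == "__main__":
    print(fisseq_read("TTGCA"))  # five bases read in a single round of four additions
    print(fisseq_read("ACGT"))   # worst-case ordering needs more additions
```

In this idealized picture, reading N bases never takes more than 4N additions, and templates containing runs are read in fewer.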

[0251] Several other modifications to the basic method of FISSEQ are contemplated. For example, a loop may be incorporated into the primer to help reduce mispriming events (Ronaghi et al., 1998, Biotechniques 25: 876-878, 880-882, and 884). A particularly useful loop structure, described by Hirao et al. (1994, Nucl. Acids Res. 22: 576-582) as “extraordinarily stable,” would have the advantage of having a relatively short stem, lowering the stability of the complementary strand hairpin, the result being that the asymmetric PCR for the strand that we want will extend to the correct end more efficiently.

[0252] Another modification would address the difficulty, encountered in many methods, of sequencing past long repeating stretches. If it is known that a given array contains many such sequences, one may include a defined regimen (for example, halfway through the whole sequence) of deoxy- and dideoxynucleotides to reduce out-of-phase templates. That is, if one knows he or she is sequencing through a repeat of, for example, AC dinucleotides, one may reduce the number of out-of-phase molecules by following a dATP addition with a ddATP addition. Only those molecules which failed to incorporate the deoxy- form of the nucleotide will be available to incorporate the dideoxy- form, leading to chain termination and reduction of that source of background. Clearly, similar regimens may be devised for repeats involving more than two nucleotides. It should be noted that the strategy is not limited to repeats and may be used to extend read length in any situation where most of the sequences in the array have a block of sequence part of the way through the target sequence which is known. For example, in an array of targets, most having the unique sequence ACGTA at the same distance from the primer, one may reduce the number of out-of-phase molecules by following a dATP addition with a ddATP, ddGTP, and ddTTP addition, then dCTP followed by ddATP, ddCTP, and ddTTP addition.
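
The deoxy-/dideoxy- regimen for a known block can be generated mechanically. The sketch below follows the ACGTA example: after each dNTP addition, every dideoxynucleotide except the one matching the next expected base is added, so that only out-of-phase molecules are terminated. It treats the known block as the order of bases incorporated, collapses runs of identical bases into one addition, and stops before the last base because the base following the block is not known; these choices and the function name are assumptions for illustration.

```python
BASES = "ACGT"


def capping_regimen(known_block):
    """Return a list of (dNTP, [ddNTPs]) steps for a known sequence block."""
    # A run of identical bases is filled by a single dNTP addition.
    collapsed = [b for i, b in enumerate(known_block)
                 if i == 0 or b != known_block[i - 1]]
    steps = []
    for current, following in zip(collapsed, collapsed[1:]):
        # Omit the ddNTP matching the next expected base so that
        # in-phase molecules are not terminated.
        terminators = ["dd%sTP" % b for b in BASES if b != following]
        steps.append(("d%sTP" % current, terminators))
    return steps


if __name__ == "__main__":
    for dntp, terminators in capping_regimen("ACGTA"):
        print(dntp, "then", ", ".join(terminators))
```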

EXAMPLE 13

[0253] Gel Sequencing of Amplified Array Features Using Dye Terminators

[0254] In addition to the methods of sequencing by hybridization and sequencing by ligation and restriction, it is possible to sequence amplified features of arrays with fluorescently labeled dideoxynucleoside triphosphates (“dye terminators”) using the Sanger (“dideoxy”) sequencing method (Sanger et al., 1975, J. Mol. Biol., 94:441) and a micro gel system. In this embodiment, the array of amplified features is created in a linear arrangement along one edge of a very thin slab gel or at the edge of a microfabricated array of capillaries. DNA molecules of the pool to be sequenced are prepared in any of the same ways as for the random array spot format described above, such that each molecule in the pool has a known sequence or sequences at one or both ends which may serve as primer binding sites. The DNA is applied to the slide as in the random array format, except that it is restricted to a thin line, rather than a circular spot. Alternatively, the DNA may be derived as a replica of a line within a standard 2D array, or may be derived as a replica of a line from a metaphase chromosome spread.

[0255] Features of the deposited linear array are then amplified using any of the methods described above for amplification of spot arrays. This amplification may be linear or exponential, thermocycled or isothermal. Isothermal amplification methods include the Phi29 rolling circle amplification method (Lizardi et al., 1998, Nature Genetics 19: 225-232), reverse transcriptase/T4 DNA polymerase/Klenow/T7 RNA polymerase linear amplification (Phillips and Eberwine, 1996, Methods 10: 283-288) and a T7 DNA polymerase/thioredoxin/ssb system (Tabor and Richardson, January 1999 Department of Energy Human Genome Program Abstract No. 15).

[0256] The amplified DNA template may be replicated using the methods described above. This template, which is immobilized either covalently, by entanglement, or by steric hindrance of the gel (or other semi-solid) is then reacted with dye terminators in the presence of the other necessary components of the dideoxy sequencing method (i.e., primer, dNTPs, buffer and polymerase). It is well known in the art that a number of polymerases may be used for dideoxy-sequencing, including but not limited to Klenow polymerase, Sequenase™ or Taq polymerase. A major advantage of dye terminators over fluorescently labeled primers (“dye primers”) is that the use of dye terminators requires only one reaction containing four distinguishably labeled terminators, whereas the use of dye primers requires four separate reactions which would require four identical amplified features and software alignment of the post-size-separation pattern. It should be noted that dye terminators also exist for RNA polymerase sequencing (Sasaki et al., 1998, Proc. Natl. Acad. Sci. USA 95: 3455-3460). It should also be noted that if the termination reactions have been performed with the use of primers, then a rare-cutting endonuclease may be used to produce a desired end for the sequencing ladder.

[0257] A miniature gel system appropriate for the gel sequencing of linear feature arrays has been described by Stein et al., 1998, Nucl. Acids Res. 26: 452-455. In this system, small, ultrathin polyacrylamide gels are cast, eight or more at a time, on standard microscope slides. The gels may be stored, ready to use, for approximately two weeks. They are run horizontally in a standard mini-agarose gel apparatus, with typical run times of 6 to 8 minutes. Stein et al. describe a novel sample loading system which permits volumes as low as 0.1 μl to be analyzed. The band resolution compares favorably with that of large-format sequencing gels. Within the context of the sequencing of linear arrays according to the invention, the sample loading is accomplished by performing the termination reactions within, or at the very edge of the gel, rather than by mechanical means.

[0258] Since the terminated reaction products remain bound to the template, the reaction may be cleaned of dNTPs, primers and salts by diffusion, flow and/or electrophoresis. The termination products are then denatured and electrophoresed perpendicular to the line of amplified features in a thin slab or capillary format. An important aspect of this method is that the order of the amplified features is preserved throughout the process. Thus, if the line of features comes from a chromosome or large cloned or uncloned DNA fragment, the long range order is preserved and greatly aids in the assembly of complex genomic regions even in the presence of long repeats. Similarly, if the lines of features are derived as replicas of lines from the standard 2D arrays, the sequence identity of each spot in that line may be determined. Similar replicas of additional lines from the 2D array may be used to determine the identity of each spot or feature of the 2D array. In addition to the clear advantages regarding the spatial organization of the features, this method has the additional advantage of actually using more of the sequencing reaction than other methods. That is, all of the reaction products are electrophoresed, rather than just a portion of them, meaning there is less waste of reagents. Further, the immobilization of the features allows the use of a common pool of reagents to sequence many features simultaneously. Thus, the method is more economical on a per sequence basis.

EXAMPLE 14

[0259] Multiplex PCR

[0260] Multiplex PCR refers to the process of amplifying a number of different DNA molecules in the same PCR reaction. Generally, the process involves the addition of multiple primer pairs, each pair specific for the amplification of a single DNA target species. A major goal of investigators is to apply the power of multiplex PCR to the problem of high throughput genotyping of individuals for specific genetic markers. If 100,000 polymorphic markers are to be assayed per genome, it would be very expensive to perform 100,000 individual PCR reactions. Some advances have been made in multiplexing PCR reactions (Chamberlain et al., 1988, Nucl. Acids Res. 16:11141), and the degree of multiplexing of the PCR has been scaled up, followed by hybridization to an array of allele-specific probes (Wang et al., 1998, Science 280: 1077). However, in the studies by Wang et al., the percentage of PCR products that successfully amplified decreased as the number of PCR primers added to the reaction increased. When approximately 100 primer pairs were used, about 90% of the PCR products were successfully amplified. When the number of primer pairs was increased to about 500, about 50% of the PCR products were successfully amplified.

[0261] The decreasing efficiency with increasing number of primers is due in large part to the phenomenon of “primer dimer” formation. Primer dimers are the result of fortuitous 3′ terminal complementarity of 4 bp or more between primers. This complementarity allows hybridization which is stabilized by polymerase recognition and extension of both strands. After the first cycle of extension, the complementarity is no longer limited to the 3′ terminal nucleotides; rather, the entire primer dimer is now complementary to the primers. This reaction efficiently competes with the desired amplification reaction, in part because the concentration of the primers is significantly greater than that of the desired amplification target, kinetically favoring the amplification of the primer dimers. This phenomenon increases with increasing numbers and concentrations of primers.
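
The 3′ terminal complementarity criterion described above can be illustrated with a short script. This is a minimal sketch only, assuming perfect Watson-Crick matching and ignoring thermodynamics; the function names (`three_prime_overlap`, `risky_pairs`) and the example primer sequences are hypothetical and are not taken from the specification. The 4 bp default threshold follows the figure given in the paragraph above.

```python
from itertools import combinations_with_replacement

# Complement table for unambiguous DNA bases only (an assumption of this sketch).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_overlap(primer_a, primer_b):
    """Longest k such that the last k bases of primer_a pair, antiparallel,
    with the last k bases of primer_b. In that geometry both 3' ends are
    recessed on a template and could be extended by a polymerase."""
    best = 0
    for k in range(1, min(len(primer_a), len(primer_b)) + 1):
        if primer_a[-k:].upper() == reverse_complement(primer_b[-k:]):
            best = k
    return best


def risky_pairs(primers, min_overlap=4):
    """List primer pairs (self-pairs included) whose 3' ends are
    complementary over at least min_overlap bases."""
    hits = []
    for a, b in combinations_with_replacement(primers, 2):
        k = three_prime_overlap(a, b)
        if k >= min_overlap:
            hits.append((a, b, k))
    return hits


# Hypothetical primers: the first two share 4 bases of 3' complementarity.
example_primers = ["TTTTTTACCG", "AAAAAACGGT", "GGGGGGGGGG"]
for a, b, k in risky_pairs(example_primers):
    print(a, b, k)
```

Because every primer is compared with every other primer, the number of pairs to screen grows roughly with the square of the number of primers, which is consistent with the observation that the problem worsens as more primers are added.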

[0262] A new approach to solving these inherent problems with multiplex PCR uses microarrays of immobilized, amplified PCR primers. By immobilizing at least one of the PCR primers, the method reduces the possibilities for non-specific primer interactions. The local concentration of primers is high enough for amplification, yet the individual primers are restricted from interacting non-specifically with one another.

[0263] Another disadvantage of standard multiplex PCR is that individual primer pairs must be synthesized for each polymorphic target. Genotyping DNA with 100,000 polymorphism targets would require, in theory, 200,000 different PCR primers. Not only is the synthesis of such primers costly and time consuming, but not all primer designs succeed in producing a desired PCR product. Therefore considerable time and energy will be spent optimizing the primer designs.

[0264] According to the new multiplex PCR method, one of the primers has a 5′ end which is generic for the entire multiplex PCR reaction, such that the entire multiplex reaction will have that segment on the “mobile” primer. This 5′ generic sequence may contain a restriction site for later cloning, a bacteriophage or other promoter for transcription of the products, or some other useful or identifiable sequence. The 3′ end of the mobile primer is complementary to any genomic (or cDNA) sequence which is to be amplified at a reasonable PCR distance from the 3′ end of the immobile primer. In other words, the 3′ end of the mobile primer is randomized. The length of the randomized 3′ sequence may be as few as 5 nucleotides, up to 10 nucleotides or more. The second, or “specific,” primers are immobilized (according to methods known in the art or described herein) to keep them from diffusing into the other primer pair zones, while the mobile primer allows the extended product to diffuse.
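
The size of the mobile primer pool implied by a randomized 3′ end can be worked out directly: a tail of n fully randomized positions contains 4^n distinct sequences. The sketch below is illustrative only. The function names are hypothetical, and the generic 5′ segment used here is the 17-nucleotide T7 RNA polymerase binding sequence from the sequence listing, chosen merely as one example of a promoter-type generic segment.

```python
from itertools import product

# Example generic 5' segment (T7 RNA polymerase binding sequence from the
# sequence listing); any other generic segment could be substituted.
GENERIC_5PRIME = "TAATACGACTCACTATA"


def pool_size(n_random):
    """Number of distinct mobile primers for n fully randomized 3' positions."""
    return 4 ** n_random


def mobile_primers(generic_5prime, n_random):
    """Enumerate every member of the mobile primer pool, written 5' to 3'."""
    for tail in product("ACGT", repeat=n_random):
        yield generic_5prime + "".join(tail)


for n in (5, 10):
    print(n, pool_size(n))
```

Under this counting, a 5-nucleotide randomized tail gives 1,024 primer species and a 10-nucleotide tail gives 1,048,576. In practice such a pool would be obtained as a degenerate mixture rather than by enumerating and synthesizing each member; the enumeration above serves only to make the pool explicit.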

[0265] There are at least two ways primer pairs may be distributed. First, two presynthesized Acrydite primers may be codeposited (Kenney et al., 1998, Biotechniques 25: 516-521; Rehman et al., 1999, Nucl. Acids Res. 27: 649-655), along with template and polymerase, in a gel volume element, for example by aerosol, emulsion, or inkjet printer, from an equimolar primer mixture. Alternatively, the primers may be derived from genomic DNA by a localized PCR. Generic primers can be used with one immobilized primer to make amplified features, and then release the new extended primers by exonuclease or type II restriction enzymes as described elsewhere herein. The new extended primers would then be copolymerized, along with template and polymerase, into the gel.

[0266] The process of this modified multiplex PCR method can be thought of as essentially two different steps. In the first, primers immobilized in a microarray hybridize with their complementary sequence in the template and are extended. In the second and subsequent steps, the 3′ (randomized) end of the mobile primers hybridizes at some point along the length of the extended immobilized primer and is itself extended. In subsequent cycles, other molecules in the immobilized primer features hybridize with the products of the previous extension, allowing extension, and so on, yielding exponential amplification as in standard PCR.
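
The two steps described above can be followed at the level of sequence strings. The sketch below is a toy model under strong assumptions: perfect-match hybridization, priming at the first matching site only, and extension to the end of the strand being copied. The template, the immobilized primer, the 5-nucleotide tail and the function names are all hypothetical and chosen only for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def extend_immobilized(template, primer):
    """Step 1: the immobilized primer anneals to its complement in the
    template and is extended to the 5' end of the template. Returns the
    extended strand (5' to 3'), or None if the primer finds no site."""
    site = template.find(reverse_complement(primer))
    if site < 0:
        return None
    return reverse_complement(template[: site + len(primer)])


def extend_mobile(extended, generic_5prime, tail):
    """Step 2: the randomized 3' tail of a mobile primer anneals somewhere
    on the extended immobilized strand and is extended to that strand's
    5' end, so the product ends in the complement of the immobilized primer."""
    site = extended.find(reverse_complement(tail))
    if site < 0:
        return None
    return generic_5prime + reverse_complement(extended[: site + len(tail)])


# Hypothetical example sequences.
template = "GATTACAGGCCTTAAGCTAGCATCG"
immobilized_primer = "CGATGCTAGC"
generic = "TAATACGACTCACTATA"
tail = "TACAG"

first_strand = extend_immobilized(template, immobilized_primer)
second_strand = extend_mobile(first_strand, generic, tail)
print(first_strand)
print(second_strand)
```

In this model the second strand carries the generic segment at its 5′ end and the complement of the immobilized primer at its 3′ end, so it can hybridize to another immobilized primer molecule in the same feature in the next cycle, which is the basis of the exponential amplification described above.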

[0267] The multiplex PCR strategy need not involve replica printing.

Use

[0268] The invention is useful for generating sets, each comprising a plurality of copies of a randomly-patterned, immobilized (thus highly reusable) nucleic acid array, from a first array upon which the molecules of a nucleic acid pool are randomly positioned. Such sets may be generated quickly, inexpensively and from unique pools of nucleic acid molecules, such as biological samples. The sets of arrays, and members of such sets, produced according to the invention are useful in expression analysis (Schena et al., 1996, Proc. Natl. Acad. Sci. USA 93: 10614-10619; Lockhart et al., 1996, Nature Biotechnology 14: 1675-1680) and genetic polymorphism detection (Chee et al., 1996, Science 274(5287): 610-614). They are also of use in DNA/protein binding assays and more general protein array binding assays. The methods of the invention are also useful for determining the sequences of nucleic acids on arrays.

Other Embodiments

[0269] Other embodiments will be evident to those of skill in the art. It should be understood that the foregoing description is provided for clarity only and is merely exemplary. The spirit and scope of the present invention are not limited to the above examples, but are encompassed by the following claims.

Number of sequences: 9

SEQ ID NO: 1 (17 nt, DNA, Artificial Sequence; T7 RNA polymerase binding sequence)
taatacgact cactata 17

SEQ ID NO: 2 (10 nt, DNA, Artificial Sequence; hypothetical sequence)
tgcatgctat 10

SEQ ID NO: 3 (25 nt, DNA, Artificial Sequence; hypothetical sequence)
atagcatgca atgcatttac gtagc 25

SEQ ID NO: 4 (32 nt, DNA, Artificial Sequence; hypothetical sequence)
gcagcagtac gactagcata tccgacnnnn nn 32

SEQ ID NO: 5 (32 nt, DNA, Artificial Sequence; hypothetical sequence)
cgatagcagt agcatgcagg tccgacnnnn nn 32

SEQ ID NO: 6 (66 nt, DNA, Artificial Sequence; hypothetical sequence)
tcggctcatc tgcatgctgc cagcagtcgg actacgtacc ccggtacgtg cgctacacgc 60
agcttt 66

SEQ ID NO: 7 (88 nt, DNA, Artificial Sequence; hypothetical sequence)
gcagcagtac gactagcata tccgacctgc gtgtagcgca cgtaccgggg tacgtagtcc 60
gactgctggc agcatgcaga tgagccga 88

SEQ ID NO: 8 (94 nt, DNA, Artificial Sequence; hypothetical sequence)
cgatagcagt agcatgcagg tccgaccagc agtcggacta cgtaccccgg tacgtgcgct 60
acacgcaggt cggatatgct agtcgtactg ctgc 94

SEQ ID NO: 9 (94 nt, DNA, Artificial Sequence; hypothetical sequence)
gcagcagtac gactagcata tccgacctgc gtgtagcgca cgtaccgggg tacgtagtcc 60
gactgctggt cggacctgca tgctactgct atcg 94

1. A method of making an immobilized nucleic acid molecule array comprising: a) providing an immobilized array of spots of a nucleic acid capture activity wherein: i) said spots are separated by a distance greater than the diameter of said spots; and ii) the size of said spots is less than the diameter of the excluded volume of said nucleic acid molecule to be captured; and b) contacting said array of spots of a nucleic acid capture activity with an excess of nucleic acid molecules capable of being bound by said nucleic acid capture activity, said nucleic acid molecules having an excluded volume diameter greater than the diameter of said spots, resulting in an immobilized nucleic acid array in which each said spot of said nucleic acid capture activity can bind only one of said nucleic acid molecules having an excluded volume greater than the size of said spots.
 2. The method of claim 1 wherein said nucleic acid capture activity is selected from the group consisting of: a hydrophobic compound; an oligonucleotide; an antibody or fragment of an antibody; a protein; a peptide; an intercalator; biotin; and avidin or streptavidin.
 3. The method of either one of claims 1 or 2 wherein said immobilized array of spots of a nucleic acid capture activity are arranged in a predetermined geometry.
 4. The method of any one of claims 1-3 wherein said spots of nucleic acid capture activity are aligned with other microfabricated features.
 5. A method of making a plurality of a nucleic acid array wherein said nucleic acid array is produced according to the method of either of claims 1-4.
 6. A method for the detection of a nucleic acid on an array of nucleic acid molecules, said method comprising: a) generating a plurality of a nucleic acid molecule array wherein the nucleic acid molecules of each member of said plurality occupy positions which correspond to those positions occupied by the nucleic acid molecules of each other member of said plurality of a nucleic acid array; and b) subjecting one or more members of said plurality, but at least one less than the total number of said plurality to a method of signal detection comprising a signal amplification method which renders said member of said plurality of a nucleic acid array non-reusable.
 7. The method of claim 6 wherein said signal amplification method comprises fluorescence measurement.
 8. The method of either one of claims 6 or 7 wherein said method of detection of a nucleic acid on an array of nucleic acid molecules detects the amount of an RNA expressed in a first RNA-containing nucleic acid population relative to that expressed in a second RNA-containing nucleic acid population, said method further comprising the steps of: a) preparing a first fluorescently labeled cDNA population using said first population of RNA-containing nucleic acid as a template; b) preparing a second fluorescently labeled cDNA population using said second population of RNA-containing nucleic acid as a template, said second fluorescently labeled cDNA population being labeled with a fluorescent label distinguishable from that used to label said first population; c) contacting a mixture of said first fluorescently labeled cDNA population and said second fluorescently labeled cDNA population with a member of said plurality of nucleic acid arrays under conditions which permit hybridization of said fluorescently labeled cDNA populations with nucleic acids immobilized on said members of said plurality of nucleic acid arrays; d) detecting the fluorescence of said first fluorescently labeled population of cDNA and the fluorescence of said second fluorescently labeled population of cDNA hybridized to said member of said plurality of nucleic acid arrays, wherein the relative amount of said first fluorescent label and said second fluorescent label detected on a given nucleic acid feature of said array indicates the relative level of expression of RNA derived from the nucleic acid of that feature in the mRNA-containing cDNA populations tested.
 9. The method of either of claims 6 or 7 wherein said method of detection of a nucleic acid on an array of nucleic acid molecules measures the amount of an mRNA expressed in a first mRNA-containing nucleic acid population relative to that expressed in a second mRNA-containing nucleic acid population, said method further comprising the steps of: a) preparing a first fluorescently labeled cDNA population using said first population of mRNA-containing nucleic acid as a template; b) preparing a second fluorescently labeled cDNA population using said second population of mRNA-containing nucleic acid as a template; c) contacting said first fluorescently labeled cDNA population with one member of a plurality of immobilized nucleic acid arrays under conditions which permit hybridization of said fluorescently labeled cDNA population with nucleic acid immobilized on said member of a plurality of immobilized nucleic acid arrays; d) contacting said second fluorescently labeled cDNA population with another member of the same plurality of immobilized nucleic acid arrays used in step (c) under conditions which permit hybridization of said fluorescently labeled cDNA population with nucleic acid immobilized on said member of a plurality of immobilized nucleic acid arrays; e) detecting the intensity of fluorescence on each member of said plurality contacted with a fluorescently labeled cDNA population in steps (c)-(d); and f) comparing the intensity of fluorescence detected in step (e) on each member of said plurality of immobilized nucleic acid arrays so tested, to determine the relative expression of mRNA derived from those nucleic acids on the array in the mRNA-containing cDNA populations tested.
 10. A method of preserving the resolution of nucleic acid features on a first immobilized array during cycles of array replication, said method comprising the following steps: a) amplifying the features of a first array to yield an array of features with a hemispheric radius, r, and a cross-sectional area, q, at the surface supporting said array, such that said features remain essentially distinct; b) contacting said array of features with a radius, r, with a support, maintained at a fixed distance from said first array, said fixed distance less than r, and such that the cross-sectional area of the hemispheric feature, measured at said fixed distance from the surface supporting said first array is less than q, and such that at least a subset of nucleic acid molecules produced by said amplifying are transferred to said support; c) covalently affixing said nucleic acid molecules to said support to form a replica of said first immobilized array, wherein the positions of said nucleic acid molecules on said replica correspond to the positions of said nucleic acid molecules of said first array from which they were amplified, and wherein the areas occupied on the surface of said support by the individual features of said replica are less than the areas occupied on the surface supporting said first immobilized array.
 11. The method of claim 10 wherein said amplifying is performed by PCR.
 12. The method of either of claims 10 or 11 wherein cycles of said steps (a)-(c) are repeated.
 13. A method for determining the nucleotide sequence of the features of an immobilized nucleic acid array, said method comprising the steps of: a) ligating a first double-stranded nucleic acid probe to one end of a nucleic acid of a feature of said array, said first double stranded nucleic acid probe having a restriction endonuclease recognition site for a restriction endonuclease whose cleavage site is separate from its recognition site and which generates a protruding strand upon cleavage; b) identifying one or more nucleotides at the end of said polynucleotide by the identity of the first double stranded nucleic acid probe ligated thereto or by extending a strand of the polynucleotide or probe; c) amplifying the features of said array using a primer complementary to said first double stranded nucleic acid probe, such that only molecules which have been successfully ligated with said first double stranded nucleic acid probe are amplified to yield an amplified array; d) contacting said amplified array with support such that at least a subset of nucleic acid molecules produced by said amplifying are transferred to said support; e) covalently attaching said subset of nucleic acid molecules transferred in step (d) to said support to form a replica of said amplified array; f) cleaving the nucleic acid features of the array with a nuclease recognizing said nuclease recognition site of said probe such that the nucleic acid of the features is shortened by one or more nucleotides; and g) repeating steps (a)-(f) until the nucleotide sequences of the features of said array are determined.
 14. The method of claim 13 wherein said nucleic acid probe comprises four components, each component being capable of indicating the presence of a different nucleotide in said protruding strand upon ligation.
 15. The method of claim 14 wherein each of said components of said probe is labeled with a different fluorescent dye and the different fluorescent dyes are spectrally resolvable.
 16. The method of any one of claims 13-15 wherein after said step (e) and before said step (f), the features of said array are amplified.
 17. The method of any one of claims 13-16 wherein amplification is performed by PCR.
 18. The method of any one of claims 13-17 wherein: i) after one or more cycles using said first double stranded nucleic acid probe in step (a), a distinct nucleic acid probe is used, in place of said first double stranded nucleic acid probe in step (a), said distinct nucleic acid probe comprising a restriction endonuclease recognition site for a restriction endonuclease whose cleavage site is separated from its recognition site, said distinct nucleic acid probe also comprising sequences such that a primer complementary to said distinct nucleic acid probe will not hybridize with said first double stranded nucleic acid probe; and ii) a primer complementary to said distinct nucleic acid probe is used in place of said primer complementary to said first double stranded nucleic acid probe in step (c), so that selective amplification of those features which successfully completed the previous cycle of restriction and ligation occurs.
 19. The method of claim 18 wherein a new distinct nucleic acid probe is used after each cycle of restriction and ligation, said new distinct nucleic acid probe comprising a sequence such that a primer complementary to that sequence will not hybridize to any probe used in previous cycles.
 20. A method of determining the nucleotide sequence of the features of an array of immobilized nucleic acids comprising the steps of: a) adding a mixture comprising an oligonucleotide primer and a template-dependent polymerase to an array of immobilized nucleic acid features under conditions permitting hybridization of the primer to the immobilized nucleic acids; b) adding a single, fluorescently labeled deoxynucleoside triphosphate to the mixture under conditions which permit incorporation of the labeled deoxynucleotide onto the 3′ end of the primer if it is complementary to the next adjacent base in the sequence to be determined; c) detecting incorporated label by monitoring fluorescence; d) repeating steps (b)-(c) with each of the remaining three labeled deoxynucleoside triphosphates in turn; and e) repeating steps (b)-(d) until the nucleotide sequence is determined.
 21. The method of claim 20 wherein the primer, buffer and polymerase are cast into a polyacrylamide gel bearing the array of immobilized nucleic acids.
 22. The method of either of claims 20 or 21 wherein said single fluorescently labeled deoxynucleotide further comprises a mixture of the single deoxynucleoside triphosphate in labeled and unlabeled forms.
 23. The method of any one of claims 20-22 wherein after step (d) and before step (e) the additional step of photobleaching said array is performed.
 24. The method of any one of claims 20-23 wherein said fluorescently labeled deoxynucleoside triphosphates are labeled with a cleavable linkage to the fluorophore.
 25. The method of claim 24 wherein after step (d) and before step (e) the additional step of cleaving said linkage to the fluorophore is performed.
 26. The method of any one of claims 20-25 wherein said oligonucleotide primer comprises sequences permitting formation of a hairpin loop.
 27. The method of any one of claims 20-26 wherein after a predetermined number of cycles of steps (b)-(d), a defined regimen of deoxynucleotide and chain-terminating deoxynucleotide analog addition is performed, such that out-of-phase molecules are blocked from further extension cycles, said regimen followed by continued cycles of steps (b)-(d) until said nucleotide sequence is determined.
 28. A method of determining the nucleotide sequence of the features of an array of immobilized nucleic acids comprising the steps of: a) adding a mixture comprising an oligonucleotide primer and a template-dependent polymerase to an array of immobilized nucleic acid features under conditions permitting hybridization of the primer to the immobilized nucleic acids; b) adding a first mixture of three unlabeled deoxynucleoside triphosphates under conditions which permit incorporation of deoxynucleotides to the end of the primer if they are complementary to the next adjacent base in the sequence to be determined; c) adding a second mixture of three unlabeled deoxynucleoside triphosphates, said second mixture comprising the deoxynucleoside triphosphate not included in the mixture of step (b), under conditions which permit incorporation of deoxynucleotides to the end of the primer if they are complementary to the next adjacent base in the sequence to be determined; d) repeating steps (b)-(c) for a predetermined number of cycles; e) adding a single, fluorescently labeled deoxynucleoside triphosphate to the mixture under conditions which permit incorporation of the labeled deoxynucleotide onto the 3′ terminus of the primer if it is complementary to the next adjacent base in the sequence to be determined; f) detecting incorporated label by monitoring fluorescence; g) repeating steps (e)-(f), with each of the remaining three labeled deoxynucleoside triphosphates in turn; and h) repeating steps (e)-(g) until the nucleotide sequence is determined.
 29. The method of claim 28 wherein for said first or second mixtures of three unlabeled deoxynucleoside triphosphates, a mixture which comprises deoxyguanosine triphosphate further comprises deoxyadenosine triphosphate.
 30. The method of either of claims 28 or 29 wherein the primer and polymerase are cast into a polyacrylamide gel bearing the array of immobilized nucleic acids.
 31. The method of any one of claims 28-30 wherein said single fluorescently labeled deoxynucleotide further comprises a mixture of the single deoxynucleoside triphosphate in labeled and unlabeled forms.
 32. The method of any one of claims 28-31 wherein after step (g) and before step (h) the additional step of photobleaching said array is performed.
 33. The method of any one of claims 28-32 wherein said fluorescently labeled deoxynucleoside triphosphates are labeled with a cleavable linkage to the fluorophore.
 34. The method of claim 33 wherein after step (g) and before step (h) the additional step of cleaving said linkage to the fluorophore is performed.
 35. The method of any one of claims 28-34 wherein said oligonucleotide primer comprises sequences permitting formation of a hairpin loop.
 36. The method of any one of claims 28-35 wherein after a predetermined number of cycles of steps (e)-(g), a defined regimen of deoxynucleotide and chain-terminating deoxynucleotide analog addition is performed, such that out-of-phase molecules are blocked from further extension cycles, said regimen followed by continued cycles of steps (e)-(g) until said nucleotide sequence of the features of the array is determined.
 37. A method of determining the nucleotide sequence of the features of a micro-array of nucleic acid molecules, said method comprising the following steps: a) creating a micro-array of nucleic acid features in a linear arrangement within and along one side of a polyacrylamide gel, said gel further comprising one or more oligonucleotide primers, and a template-dependent polymerizing activity; b) amplifying the microarray of step (a); c) adding a mixture of deoxynucleoside triphosphates, said mixture comprising each of the four deoxynucleoside triphosphates dATP, dGTP, dCTP and dTTP, said mixture further comprising chain-terminating analogs of each of the deoxynucleoside triphosphates dATP, dGTP, dCTP and dTTP, and said chain-terminating analogs each distinguishably labeled with a spectrally distinguishable fluorescent moiety; d) incubating said mixture with said micro-array under conditions permitting extension of said one or more oligonucleotide primers; e) electrophoretically separating the products of said extension within said polyacrylamide gel; and f) determining the nucleotide sequence of the features of said micro-array by detecting the fluorescence of the extended, terminated and separated reaction products within the gel.
 38. The method of claim 37 wherein said amplifying is performed by PCR.
 39. The method of claim 37 wherein said amplifying is performed by an isothermal method.
 40. The method of any of claims 37-39 wherein said microarray of nucleic acid features in a linear arrangement is derived as a replica of features arranged on a chromosome.
 41. The method of any one of claims 37-39 wherein said micro-array of nucleic acid features in a linear arrangement is derived as a replica of one linear subset of features on a separate, non-linear micro-array of nucleic acid features.
 42. A method of simultaneously amplifying a plurality of nucleic acids, said method comprising the steps of: a) creating a micro-array of immobilized oligonucleotide primers; b) incubating the microarray of step (a) with amplification template and a non-immobilized oligonucleotide primer under conditions allowing hybridization of said template with said oligonucleotide primers; c) incubating the hybridized primers and template of step (b) with a DNA polymerase activity, and deoxynucleotide triphosphates under conditions permitting extension of the primers; d) repeating steps (b) and (c) for a defined number of cycles to yield a plurality of amplified DNA molecules.
 43. The method of claim 42 wherein said non-immobilized oligonucleotide primer comprises a pool of oligonucleotide primers comprised of 5′ and 3′ sequence elements, said 5′ sequence element identical in all members of said pool, and said 3′ sequence element containing random sequences.
 44. The method of claim 43 wherein said 5′ sequence element comprises a restriction endonuclease recognition sequence.
 45. The method of either of claims 43 or 44 wherein said 5′ element comprises a transcriptional promoter sequence.
 46. The method of either of claims 42 or 43 wherein said immobilized primers are amplified before step (b).
 47. The method of any one of claims 42-44 wherein said immobilized oligonucleotide primers are generated from genomic DNA.
 48. The method of any one of claims 42-47 wherein the microarray, template, non-immobilized primer, and polymerase are cast in a polyacrylamide gel.